=Paper= {{Paper |id=Vol-2290/kars2018_paper2 |storemode=property |title=Knowledge Graph-based Weighting Strategies for a Scholarly Paper Recommendation Scenario |pdfUrl=https://ceur-ws.org/Vol-2290/kars2018_paper2.pdf |volume=Vol-2290 |authors=Rubén Manrique,Olga Marino |dblpUrl=https://dblp.org/rec/conf/recsys/ManriqueM18 }} ==Knowledge Graph-based Weighting Strategies for a Scholarly Paper Recommendation Scenario== https://ceur-ws.org/Vol-2290/kars2018_paper2.pdf

Knowledge graph-based weighting strategies for a scholarly
paper recommendation scenario
Rubén Manrique Olga Marino
Systems and Computing Engineering Department, School Systems and Computing Engineering Department, School
of Engineering. Universidad de los Andes of Engineering. Universidad de los Andes
Bogotá, Colombia Bogotá, Colombia
rf.manrique@uniandes.edu.co olmarino@uniandes.edu.co

ABSTRACT the knowledge of multiple domains, and in general, they specify a
In this paper, we study the effects of different node and edge weight- large number of interrelationships between concepts [12].
ing strategies of graph-based semantic representations on the accu- In previous works by the authors, a semantic representation that
racy of a scholarly paper recommendation scenario. Our semantic exploits the structured knowledge present in KGs for the task of rec-
representation relies on the use of Knowledge Graphs (KGs) for ommending scholarly papers was proposed [5–7]. In these works,
acquiring relevant additional information about concepts and their the representations of both user profiles and scholarly papers are
semantic relations, thus resulting in a knowledge-rich graph doc- based on the concepts mentions found in the paper’s content plus
ument model. Recent studies have used this representation as the additional processes that consider the semantic relation of the con-
basis of a scholarly paper recommendation system. Even when the cepts found in the KG. The resulting semantic representation is,
recommendation is made based on the comparison of graphs, little in essence, a directed weighted graph whose nodes represent con-
has been explored regarding the effects of the weights assigned to cepts and whose edges express the existence of a semantic linkage
the edges and nodes in the representation. In this paper, we present between two concepts in the KG. The similarity score between a
the initial results obtained from a comparative study of the effects user profile and a document is computed via a graph similarity al-
of different weighting strategies on the quality of the recommen- gorithm. Compared with standard content-based recommendation
dations. Three weighting strategies for edges (Number of Paths systems that look at documents and user profiles as bags of terms
(NP), Semantic Connectivity Score (SCS), and Hierarchical Similar- (i.e. keyword based representations), results show the superiority
ity (HS)) and two for nodes (Concept Frequency (CF) and PageRank of the semantic representation [5–7].
(PR)) are considered. Results show that the combination of the SCS In the proposed semantic representation weights are assigned to
and CF outperform the other weighting strategy combinations and both nodes and edges. The weights of the nodes consider the im-
the considered baselines. portance of the concept in the document, while the edges’ weights
capture the degree of associativity between concepts in the KG. Al-
KEYWORDS though existing graph similarity measures can exploit these weights,
the effect of different weighting strategies in the recommendation
Knowledge-based graph representations, scholarly papers recom-
has not been fully explored. In this paper, we present initial results
mendation, semantics-aware recommender system
of a comparative study on the recommendation performance using
ACM Reference Format: different weighting strategies for a graph-based representation. The
Rubén Manrique and Olga Marino. 2018. Knowledge graph-based weighting evaluation is done using a scholarly paper recommendation dataset
strategies for a scholarly paper recommendation scenario. In Proceedings of that contains the user profiles of eleven professors of computer
Knowledge-aware and Conversational Recommender Systems (KaRS) Work-
science [7].
shop 2018 (co-located with RecSys 2018) (KaRS’18). ACM, New York, NY, USA,
4 pages.
The paper is organized as follows: Section 2 reviews some related
work in semantics-aware recommender systems. Section 3 presents
the semantic representation process and its diverse modules. Sec-
1 INTRODUCTION
tions 4 and 5 present the considered weighting functions. Section
In recent years and with the advent of the Semantic Web and 6 describes the evaluation framework and Section 7 the results.
Linked Open Data (LOD), new semantics-aware content representa- Finally, conclusions and directions for future work are discussed in
tions have been proposed for the next generation of recommender Section 8.
systems[4, 8]. LOD provides a variety of structured knowledge
bases in RDF format that are freely accessible on the Web and in- 2 RELATED WORK
terconnected to each other. As a result, there is a large amount of
machine-readable data available that could be exploited to build Recent contributions to semantics-aware recommender systems
more intelligent systems [8]. From this huge amount of data, Knowl- have focused on exogenous semantic representations that intro-
edge Graphs (KGs) are particularly important since they concentrate duce the semantics by linking the recommendation item to a KG
[3, 4, 8, 9]. The ESWC 2014 Challenge [2], for example, worked on
KaRS’18, October 7, 2018, Vancouver, Canada. books that were mapped to their corresponding DBpedia resource.
2018. ACM ISBN Copyright for the individual papers remains with the authors. Copying In scenarios where there is not a broad enough open knowledge
permitted for private and academic purposes. This volume is published and copyrighted
by its editors.. base that describes items or where most of the important informa-
tion about the item is encoded via textual content, these approaches
KaRS’18, October 7, 2018, Vancouver, Canada. Rubén Manrique and Olga Marino

cannot be easily used. In such cases, an alternative approach is to
build semantic representations by processing textual content and
mapping the concept mentions found to a KG via entity linking
tools in such a way that the item is represented by the multiple
concepts found. Piao et al. explore this approach for personalized
link recommendations on Twitter [13, 14] and more recently Man-
rique et al. for a scholarly paper recommendation task [6, 7]. In
this paper, we are also using this type of semantic representation,
however instead of using a vector representation of concepts we
use a graph-based representation. Therefore, we rely on a graph
similarity strategy to rank candidate items.

3 SEMANTIC REPRESENTATION
To build the semantic representation, we begin with the identifica-
tion of concepts mentions in the text (i.e. annotations). Different
tools for entity linking and word sense disambiguation can be used
for this task. As an example, consider the short input text in Figure 1.
After using an automatic annotation tool a set of URIs correspond-
ing to KG concepts is returned. It is important to mention that no
human verification is performed on the set of annotations retrieved
by the automatic tool. Then, expansion and filtering processes that
consider the semantic relationships found in the KG are applied.
The expansion process is used to enrich the representation with
concepts that are not explicitly mentioned in the text or not iden-
tified by the annotation tool but are strongly related with anno-
tations. Our previous results show that expanded concepts can
be important to reinforce the main topic of the document even if
they do not occur in the text. The expansion process adds more
discriminative power to the representation. We expand the set of Figure 1: Semantic representation process description
annotations to new related ones following two different approaches:
category-based and property-based. The category-based expansion
the importance of each concept and the strength of the edges are
incorporates the hierarchical information of the concepts. For the
evaluated via different weighting functions (see Sections 4 and 5).
“Artificial_Intelligence” concept, for example, the category “Cate-
The resulting representation follows Definition 1.
gory:Computational_neuroscience” is retrieved and incorporated
Definition 1. The semantic representation G i of a profile/document
into the representation. For property-based expansion, the KG on-
pi is a directed weighted graph G i = (Ni , Ei , w(c), w(e)), where
tology is navigated and a set of related concepts is incorporated.
both nodes and edges have an associated weight defined by the
For example, from the “Robotics” annotation the concept “Cooper-
functions w(c) : N → + and w(e) : E → +. The set of nodes
ation” is retrieved by following the “wikiPageWikiLink” property
Ni = {c 1 , c 2 , ..., c k } are entities/concepts belonging to the space of
in the ontology. Only outgoing links from a given annotation are
a KG. The node weight w(c) denotes how relevant the node c is for
considered.
the profile/document. An edge between two nodes (c a , cb ) repre-
The filtering strategy seeks to eliminate irrelevant and possible
sents the existence of at least one statement in the KG that links
noisy concepts in the representation. The noisy concepts can be the
both concepts. The weight of the edge w(e) denotes how strong
result of incorrect annotations, off-topic concepts found in the text
this linkage is between the considered concepts.
or added in the expansion step [6]. We found that noisy concepts
tend to be disconnected (i.e. a low number of connections with
4 EDGE WEIGHTING FUNCTIONS
other concepts). Property paths1 between every pair of concepts Íτ
l =1 |paths c ,c |
are analyzed and constitute the base information for the graph edge Number of paths (NP). NP is defined as N P(c i , c j ) = MP
i j
,
conformation. Basically, an edge is created if there is a property
where |pathsc i ,c j | is the number of paths of length l between con-
path between the given pair of concepts. Then, the filtering strategy cepts c i and c j in the KG, and τ is the maximum length of path
eliminates concepts with a node degree below or equal to α. considered. MP is a normalization parameter that is equal to the
Different property path lengths between concepts can be consid- maximum number of paths found for a pair of concepts in the
ered in the filtering process, so different semantic representations representation.
can be produced for the same input text. The result of these pro- The Semantic Connectivity Score (SCS) [10]. SCS measures
cesses is a graph whose nodes are concepts and edges express the latent connections between concept pairs and is computed as:
existence of a linkage between two concepts in the KG. Finally, SCS(c i , c j ) = 1 − 1 . β is a damping factor that

l =1 β
Íτ l
1+( |pathsc i ,c j |)
1 https://www.w3.org/TR/sparql11-property-paths/
penalize longer paths (in our case β = 0.5). NP and SCS consider
Knowledge graph-based weighting strategies KaRS’18, October 7, 2018, Vancouver, Canada.

the number of paths between the given concepts as indicative of a to their Graph Similarity (GS) to the user profile. For GS we employ
strong linkage. the edit distance implemented in [5] and defined in Equation 2.
Hierarchical Similarity (HS). We use the hierarchical informa-
tion of the concepts in the KG to calculate a measure of similarity GSnodes (G i , G j ) + GS edдes (G i , G j )
between the concepts. The higher the similarity, the stronger the GS(G i , G j ) = (2)
link between the concepts. If A is the set of categories for the con- Í2
α nodes + c ∈Ni ∩N j |w i (c) − w j (c)|
cept c i and B is the set of categories for the concept c j , HS is defined GSnodes (G i , G j ) = 1 −
α nodes + c ∈Ni ∩N j max(w i (c), w j (c))
Í
as:
α edдes + e ∈Ei ∩E j |w i (e) − w j (e)|
Í
GS edдes (G i , G j ) = 1 −
α edдes + e ∈Ei ∩E j max(w i (e), w j (e))
Í
HS(c i , c j ) = max taxsim(cati , cat j ) (1)
(cat i ∈A,cat j ∈B) where α nodes and α edдes are defined as:
δ (root, catlca ) Õ Õ Õ Õ
taxsim = α nodes = w i (c) + w j (c) − w i (c) − w j (c)
δ (cati , catlca ) + δ (cat j , catlca ) + δ (root, catlca )
c ∈N i c ∈N j c ∈N i ∩N j c ∈N i ∩N j
Given the hierarchy of categories T , the catlca of two categories
Õ Õ Õ Õ
α edдes = w i (e) + w j (e) − w i (e) − w j (e)
cati and cat j is the vertex of greatest depth in T that is the common
e ∈E i e ∈E j e ∈E i ∩E j e ∈E i ∩E j
ancestor of both cati and cat j . δ (cata , catb ) is the number of edges
on the shortest path between cata and catb . GS evaluates the similarity between two graphs in terms of the
weights differences of the common nodes/edges compared to the
5 NODE WEIGHTING FUNCTIONS total weight of the nodes/edges in the two graphs. Therefore, two
graphs are similar not only if their nodes/edges coincide but also if
Concept frequency (CF) + Discounts. Inspired by TF-IDF, CF their weights are close in magnitude.
analyzes the occurrences of the concept in the input content as
well as the frequency in the set of semantic representations. CF is 6.1 Dataset
defined as: CF (c) = wc f (c) × loд mMc , where wc f (c) represents the
We employ the dataset proposed in [6] that contains the user profiles
number of times that c appears in the input content, M is the total
of 11 professors in the area of computer science. The user profiles
number of documents/profiles in the dataset and mc is the number
were built using the full text of the most recent publications found
of documents/profiles with the concept c in their representation.
on their Google Scholar web pages. At least a minimum of twelve
After the expansion process, the representation could be diverted
of each professor’s most recent publications were used as input for
towards frequent properties in the set of instances of the KG or
the semantic representation process. The candidate set is a subset of
general categories in the hierarchical structure of the KG. For the
Core and Arxiv open corpora that contains 5710 different academic
categories added through the expansion the following discount is
1 1 documents (i.e. papers, tech reports, thesis, etc.). The ground truth
applied: CFdiscount (cat) = CF (cat) × loд(S P ) × loд(SC) where SP of papers is a subset of the candidate set in which users express
is the set of concepts belonging to the category and SC is the set an explicit interest via a web-based search system. In the data
of sub-categories in the category hierarchy. The idea behind this set, for each user there are at least 10 relevant documents. As for
discount strategy is that categories that are too broad and generic the user profile, the full text was used to construct the semantic
are penalizes [11]. Similarly, for property-based expanded concepts, representation of each document in the candidate set.
the following discount is applied: CFdiscount (c) = CF (c) × loд(P 1
) For the construction of the semantic representation we use: (i)
where P is the number of occurrences of the property in the KG DBpedia as KG, (ii) DBpedia Spotlight as annotation service, (iii) a
from which the concept c ∈ C is obtained. maximum path length of 2 for filtering and edge conformation as
PageRank (PR): PageRank is a well-known node ranking algo- well as for the τ parameter, (iv) a minimum degree value α = 1, (v)
rithm. We use the PageRank version for directed weighted graphs categories extracted through dct:subject to calculate HS.
(i.e. it considers the edge weights). Therefore, the results obtained
with this centrality measure depends on the resulting link structure 7 RESULTS
in the semantic graph and the associated edge weight. The performance of the recommender system was evaluated by
typical metrics for the evaluation of Top-N recommender tasks:
6 EXPERIMENTAL SETUP MRR (Mean Reciprocal Rank), MAP@10 (Mean Average Precision),
Our main goal is to analyze the influence of the different weighting and NDCG@10 (Normalized Discounted Cumulative Gain). We
strategies in the context of scholarly paper recommendation. We select N=10 as the recommendation objective since it is a common
compare the quality achieved by the same recommendation algo- rank and it fits with the minimum number of relevant documents
rithm when inputting semantic representations for user profiles and per user in the dataset.
documents using the different weighting strategies. In this regard, Table 1 presents the results obtained by the different combina-
we embrace the content-based algorithm described in Definition 2. tions of the weighting strategies. We use a classical content-based
Definition 2. Recommendation Algorithm: given a user profile u recommendation algorithm baseline that ranks candidate items
and a set of candidate scholarly papers SP = {p1 , ..., pn }, which are according to their cosine similarity with the user profile. In this
represented using the graph-based representation in Definition 1, case, profiles and documents use a standard bag-of-words Vector
the recommendation algorithm ranks the candidate items according Space Model (VSM) representation. According to Beel et al. VSM
KaRS’18, October 7, 2018, Vancouver, Canada. Rubén Manrique and Olga Marino

Table 1: Performance of scholarly paper recommendation to test this assumption by favoring/disfavoring the contribution of
using different weighting strategies for nodes and edges. (*) nodes and edges.
indicates the improvement over baselines is statistically sig-
nificant (p < 0.05). ACKNOWLEDGMENTS
This work was partially supported by COLCIENCIAS PhD scholar-
Edge Weight Node Weight MRR MAP NDCG ship (Call 647-2014).
NP CF 0.503 0.5* 0.661
NP PR 0.523* 0.401 0.482 REFERENCES
SCS CF 0.482 0.523* 0.716* [1] Joeran Beel, Bela Gipp, Stefan Langer, and Corinna Breitinger. 2016. Research-
SCS PR 0.574* 0.502* 0.672 paper recommender systems: a literature survey. International Journal on Digital
Libraries 17, 4 (2016), 305–338. https://doi.org/10.1007/s00799-015-0156-0
HS CF 0.421 0.395 0.495 [2] Tommaso Di Noia, Iván Cantador, and Vito Claudio Ostuni. 2014. Linked Open
HS PR 0.442 0.342 0.434 Data-Enabled Recommender Systems: ESWC 2014 Challenge on Book Recom-
mendation. In Semantic Web Evaluation Challenge, Valentina Presutti, Milan
Cosine Baseline 0.401 0.345 0.423 Stankovic, Erik Cambria, Iván Cantador, Angelo Di Iorio, Tommaso Di Noia,
VEO Baseline 0.442 0.427 0.601 Christoph Lange, Diego Reforgiato Recupero, and Anna Tordai (Eds.). Springer
International Publishing, Cham, 129–143.
[3] Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, Davide Romito, and
Markus Zanker. 2012. Linked Open Data to Support Content-based Recommender
Systems. In Proceedings of the 8th International Conference on Semantic Systems
with TF-IDF is the most frequent profiling and weighting scheme in (I-SEMANTICS ’12). ACM, New York, NY, USA, 1–8. https://doi.org/10.1145/
research paper recommender systems [1]. As a baseline, we also use 2362499.2362501
VEO (Vertex Edge Overlap [5]). VEO uses an unweighted version [4] Tommaso Di Noia and Vito Claudio Ostuni. 2015. Recommender Systems and
Linked Open Data. Springer International Publishing, Cham, 88–113. https:
of the semantic representation (Definition 1). The candidate items, //doi.org/10.1007/978-3-319-21768-0_4
in this case, are ranked based on the number of common nodes and [5] Rubén Manrique, Felipe Cueto, and Olga Mariño. 2018. Comparing Graph Sim-
ilarity Measures for Semantic Representations of Documents. In Advances in
edges they have with the user’s profile. VEO was chosen to evaluate Computing. Springer International Publishing, Cham, 3–16.
whether the consideration of weights in the representation has a [6] Rubén Manrique, Omar Herazo, and Olga Mariño. 2017. Exploring the Use of
positive impact on the recommendation. Linked Open Data for User Research Interest Modeling. In Advances in Computing,
Andrés Solano and Hugo Ordoñez (Eds.). Springer International Publishing, Cham,
We can see that the combination of SCS-CF outperforms other 3–16.
weighting strategies and the baselines in terms of MAP and NDCG. [7] Rubén Manrique and Olga Mariño. 2017. How Does the Size of a Document Affect
In terms of MRR, the best weighting strategy is SCS-PR. Although Linked Open Data User Modeling Strategies?. In Proceedings of the International
Conference on Web Intelligence (WI ’17). ACM, New York, NY, USA, 1246–1252.
SCS and NP consider the number of existing paths, the results show https://doi.org/10.1145/3106426.3109440
a better performance for SCS. This can be attributed to the fact [8] Cataldo Musto, Pierpaolo Basile, Pasquale Lops, Marco de Gemmis, and Giovanni
Semeraro. 2017. Introducing linked open data in graph-based recommender
that in addition to the number of paths, SCS also penalizes the systems. Information Processing & Management 53, 2 (2017), 405 – 435. https:
path length through a damping factor. Further experimentation is //doi.org/10.1016/j.ipm.2016.12.003
required to evaluate the effect of the β damping factor. HS presents [9] Cataldo Musto, Pasquale Lops, Pierpaolo Basile, Marco de Gemmis, and Giovanni
Semeraro. 2016. Semantics-aware Graph-based Recommender Systems Exploit-
the worst results among the weighting strategies and only slightly ing Linked Open Data. In Proceedings of the 2016 Conference on User Modeling
better than the cosine baseline. Adaptation and Personalization (UMAP ’16). ACM, New York, NY, USA, 229–237.
Regarding the node weighting strategies, results show that CF https://doi.org/10.1145/2930238.2930249
[10] Bernardo Pereira Nunes, Besnik Fetahu, Ricardo Kawase, Stefan Dietze, Marco An-
is more appropriate. CF is superior to PR in terms of MAP and tonio Casanova, and Diana Maynard. 2015. Interlinking Documents Based on
NDCG. On the other hand, according to the MRR, in average PR Semantic Graphs with an Application. Springer International Publishing, Cham,
139–155.
ranks higher the first relevant result. The comparison with VEO [11] Fabrizio Orlandi, John Breslin, and Alexandre Passant. 2012. Aggregated, Inter-
also shows that the consideration of weights for nodes and edges operable and Multi-domain User Profiles for the Social Web. In Proceedings of the
improves the semantic representation proposed for the task of 8th International Conference on Semantic Systems (I-SEMANTICS ’12). ACM, New
York, NY, USA, 41–48. https://doi.org/10.1145/2362499.2362506
recommending scholarly papers. [12] Heiko Paulheim. 2017. Knowledge graph refinement: A survey of approaches
and evaluation methods. Semantic Web 8, 3 (2017), 489–508. https://doi.org/10.
3233/SW-160218
8 CONCLUSIONS AND FUTURE WORK [13] Guangyuan Piao and John G. Breslin. 2016. Analyzing Aggregated Semantics-
In this paper, we explored different node and edge weighting strate- enabled User Modeling on Google+ and Twitter for Personalized Link Rec-
ommendations. In Proceedings of the 2016 Conference on User Modeling Adap-
gies for a graph-based semantic model for representing user profiles tation and Personalization (UMAP ’16). ACM, New York, NY, USA, 105–109.
and papers in the context of the scholarly paper recommendation https://doi.org/10.1145/2930238.2930278
task. The combination of SCS and CF as weighting strategies for [14] Guangyuan Piao and John G. Breslin. 2016. Exploring Dynamics and Semantics
of User Interests for User Modeling on Twitter for Link Recommendations. In
edges and nodes respectively presented the best results. SCS con- Proceedings of the 12th International Conference on Semantic Systems (SEMANTiCS
siders the number of existing paths in the KG between the concepts 2016). ACM, New York, NY, USA, 81–88. https://doi.org/10.1145/2993318.2993332
considered, while CF considers the frequency of the concept in
the document/profile. In the near future, we plan to explore other
measures of centrality for nodes weighting (e.g., betweenness, close-
ness). We also want to explore the effect of path lengths greater than
2 for the calculation of SCS/NP. Finally, according to our recommen-
dation algorithm and graph similarity measure (Equation 2), the
contribution of edges and nodes are considered equivalent. We want