Automatic Selection of Linked Open Data Features in Graph-based Recommender Systems

Cataldo Musto, Pierpaolo Basile, Marco de Gemmis, Pasquale Lops, Giovanni Semeraro, Simone Rutigliano
Università degli Studi di Bari "Aldo Moro" - Italy
name.surname@uniba.it

ABSTRACT
In this paper we compare several techniques to automatically feed a graph-based recommender system with features extracted from the Linked Open Data (LOD) cloud. Specifically, we investigate whether the integration of LOD-based features can improve the effectiveness of a graph-based recommender system and to what extent the choice of the feature selection technique influences the behavior of the algorithm by endogenously inducing a higher accuracy or a higher diversity. The experimental evaluation showed a clear correlation between the choice of the feature selection technique and the ability of the algorithm to maximize a specific evaluation metric. Moreover, our algorithm fed with LOD-based features was able to overcome several state-of-the-art baselines: this confirms the effectiveness of our approach and suggests further investigation of this research line.

Keywords
Recommender Systems, PageRank, Graphs, Linked Open Data, Feature Selection, Diversity

1. BACKGROUND
The Linked Open Data (LOD) cloud is a huge set of interconnected RDF statements covering many topical domains, ranging from government and geographical data to structured information about media (movies, books, etc.) and life sciences. The typical entry point to this plethora of data is DBpedia [1], the RDF mapping of Wikipedia, which is commonly considered the nucleus of the emerging Web of Data. Thanks to the widespread availability of this free machine-readable knowledge, a big effort is now being spent to investigate whether and how the data gathered from the LOD cloud can be exploited to improve intelligent and adaptive applications, such as Recommender Systems (RS).
Recent attempts towards the exploitation of Linked Open Data to build RSs are due to Passant [6], who proposed a music recommender system based on semantic similarity calculations over DBpedia properties. The use of DBpedia for similarity calculation is also the core of the work presented by Musto et al. [4]: in that paper, user preferences in music extracted from Facebook are used as input to find other relevant artists and to build a personalized music playlist. Recently, the use of LOD-based data sources has been at the core of the ESWC 2014 Recommender Systems Challenge (http://2014.eswc-conferences.org/important-dates/call-RecSys): in that setting, the best-performing approaches [2] were based on ensembles of several widespread algorithms running on diverse sets of features gathered from the LOD cloud. However, none of the above works tackles the issue of automatically selecting the best subset of LOD-based features, nor analyzes the impact of such selection techniques on metrics other than accuracy, such as the diversity of the recommendations.
To this end, in this paper we propose a methodology to automatically feed a graph-based recommendation algorithm with features extracted from the LOD cloud. We focus on graph-based approaches since they use a uniform formalism to represent both collaborative features (connections between users and items, expressed through ratings) and LOD-based ones (connections between different items, expressed through RDF statements). As graph-based algorithm we adopted PageRank with Priors [3]. Moreover, we compared several techniques to automatically select the best subset of LOD-based features, with the aim of investigating to what extent the choice of the feature selection technique influences the behavior of the algorithm and can endogenously lead to a higher accuracy or a higher diversity of the recommendations.
The rest of the paper is organized as follows: the description of our recommendation methodology is the core of Section 2, while the details of the experimental evaluation we carried out, along with the discussion of the results, are provided in Section 3. Finally, Section 4 sketches conclusions and future work.
2. METHODOLOGY
The main idea behind our graph-based model is to represent users and items as nodes in a graph. Formally, given a set of users U = {u_1, ..., u_n} and a set of items I = {i_1, ..., i_m}, a graph G = ⟨V, E⟩ is instantiated. Given that a node is created for each user and for each item, |V| = |U| + |I|. Next, an edge connecting a user u_i with an item i_j is created for each positive feedback expressed by that user, so E = {(u_i, i_j) | likes(u_i, i_j) = true}. In this basic formulation we only model user-item couples, as in collaborative filtering algorithms.
Given this basic formulation, built on the ground of simple collaborative data points, each item i ∈ I can be provided with a relevance score. To calculate the relevance of each item, we used a well-known variant of PageRank called PageRank with Priors [3]. Differently from PageRank, which assigns an evenly distributed prior probability to each node (1/N, where N is the number of nodes), PageRank with Priors adopts a non-uniform personalization vector that assigns different weights to different nodes in order to bias the random walk towards some of them (specifically, the preferences of a specific user). In our algorithm the prior probability was distributed according to a simple heuristic, set after a rough tuning: 80% of the total weight is evenly distributed among the items liked by the user (0% is assigned to items disliked by the user), while the remaining 20% is evenly distributed among the other nodes. The damping factor was set to 0.85, as in [5].
Given this setting, PageRank with Priors is executed once for each user (this is mandatory, since the prior probabilities change according to the user's feedback), and nodes are ranked according to their PageRank score, which is in turn calculated on the ground of the connectivity of the graph. The output is a list of nodes ranked by PageRank score, labeled L. Given L, recommendations are built by extracting from L only those nodes i_1, ..., i_n ∈ I.
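The paper provides no reference implementation, so the following is a minimal sketch of the two steps just described (bipartite graph construction and per-user PageRank with Priors), written with networkx under the stated 80%/20% prior split and damping factor 0.85; the helper names and the (user, item, is_positive) rating format are assumptions introduced here for illustration.

```python
# Illustrative sketch, not the authors' implementation.
import networkx as nx

def build_graph(ratings):
    """ratings: iterable of (user, item, is_positive) tuples."""
    G = nx.Graph()
    for user, item, positive in ratings:
        G.add_node(("user", user))
        G.add_node(("item", item))
        if positive:  # E = {(u_i, i_j) | likes(u_i, i_j) = true}
            G.add_edge(("user", user), ("item", item))
    return G

def recommend(G, liked, disliked, top_n=10):
    """PageRank with Priors for one user: 80% of the prior mass is split evenly
    among liked items, 0% goes to disliked items, and the remaining 20% is
    spread over all other nodes; the damping factor is 0.85."""
    liked = {("item", i) for i in liked}
    disliked = {("item", i) for i in disliked}
    others = [n for n in G if n not in liked and n not in disliked]
    priors = {n: 0.0 for n in G}
    priors.update({n: 0.80 / len(liked) for n in liked})
    priors.update({n: 0.20 / len(others) for n in others})
    scores = nx.pagerank(G, alpha=0.85, personalization=priors)
    # Keep only item nodes the user has not rated yet, ranked by score.
    ranked = sorted((n for n in scores if n[0] == "item" and n not in liked | disliked),
                    key=scores.get, reverse=True)
    return ranked[:top_n]
```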
2.1 Introducing LOD-based features
As stated above, our basic formulation does not take into account any data point other than users' ratings. The insight behind this work is to enrich the graph described above by introducing extra nodes and extra edges, defined on the ground of the information available in the LOD cloud. Formally, we want to define an extended graph G_LOD = ⟨V_LOD_ALL, E_LOD_ALL⟩, where V_LOD_ALL = V ∪ V_LOD and E_LOD_ALL = E ∪ E_LOD. V_LOD and E_LOD represent the extra nodes and the extra edges instantiated by analyzing the data gathered from the LOD cloud, respectively.
As an example, if we consider the movie The Matrix, the property http://dbpedia.org/property/director, encoding the information about the director of the movie, is available in the LOD cloud. Consequently, an extra node The Wachowski Brothers is added to V_LOD and an extra edge, labeled with the name of the property, is instantiated in E_LOD to connect the movie with its director. Similarly, if we consider the property http://dbpedia.org/property/starring, new nodes and new edges are defined in order to model the relationship between The Matrix and its main actors, such as Keanu Reeves. In turn, given that Keanu Reeves acted in several movies, many new edges are added to the graph and many new paths now connect different movies: these paths would not have been available if only the collaborative data points were instantiated.
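As a hedged illustration of this enrichment step (the paper does not describe its extraction pipeline), the sketch below pulls the values of a couple of DBpedia properties for a mapped item through the public SPARQL endpoint and adds them as extra nodes and labeled edges; PROPERTIES, lod_triples and enrich are names introduced here, not part of the original system.

```python
# Illustrative enrichment sketch: fetch LOD-based features for one item and
# add them to the collaborative graph G as V_LOD nodes and E_LOD edges.
from SPARQLWrapper import SPARQLWrapper, JSON

DBPEDIA = SPARQLWrapper("http://dbpedia.org/sparql")
PROPERTIES = ["director", "starring"]  # illustrative subset of the properties used in the paper

def lod_triples(dbpedia_uri):
    """Return (property, value) pairs for one item, e.g. dbpedia.org/resource/The_Matrix."""
    prop_list = ", ".join(f"dbp:{p}" for p in PROPERTIES)
    DBPEDIA.setQuery(f"""
        PREFIX dbp: <http://dbpedia.org/property/>
        SELECT ?p ?o WHERE {{ <{dbpedia_uri}> ?p ?o . FILTER(?p IN ({prop_list})) }}
    """)
    DBPEDIA.setReturnFormat(JSON)
    rows = DBPEDIA.query().convert()["results"]["bindings"]
    return [(r["p"]["value"], r["o"]["value"]) for r in rows]

def enrich(G, item, dbpedia_uri):
    """Add the extra nodes and labeled edges for one item of the graph G."""
    for prop, value in lod_triples(dbpedia_uri):
        G.add_node(("lod", value))  # e.g. The Wachowski Brothers
        G.add_edge(("item", item), ("lod", value), property=prop)
```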
It immediately emerges that, due to this enriched representation, the structure of the graph changes tremendously, since many new nodes and many new edges are added to the model: the first goal of our experimental session will be to investigate whether graph-based RSs can benefit from the introduction of these novel LOD-based features.

2.2 Selecting LOD-based features
Thanks to the data points available in the LOD cloud, much new information can be encoded in our graph. However, as the number of extra nodes and extra edges grows, the computational load of PageRank with Priors grows as well, so it is necessary to identify the subset of the most useful properties gathered from the LOD cloud and to investigate to what extent (if any) each of them improves the accuracy of our recommendation strategy.
A very naive approach would be to manually select the most relevant LOD-based features, according to simple heuristics or to domain knowledge (e.g., properties such as director, starring and composer may be considered relevant for the movie domain, whereas properties such as runtime or country may not). This basic approach has several drawbacks: it requires a manual effort and it is strictly domain-dependent. To avoid this, we employed feature selection (FS) techniques to automatically select the most promising LOD-based features.
Formally, our idea is to take as input E_LOD, the overall set of LOD-based properties, and to produce as output E_LOD_FS_T ⊆ E_LOD, the set of properties that a specific feature selection technique T returns as relevant. Clearly, the exploitation of a feature selection technique T also produces a set V_LOD_FS_T ⊆ V_LOD, containing all the LOD-based nodes connected to the properties in E_LOD_FS_T. In this setting, given a FS technique T, PageRank is executed against the graph G_LOD_T = ⟨V_LOD_T, E_LOD_T⟩, where V_LOD_T = V ∪ V_LOD_FS_T and E_LOD_T = E ∪ E_LOD_FS_T.
In the experimental session we evaluated the effectiveness of seven different techniques for the automatic selection of LOD-based features: PageRank, Principal Component Analysis (PCA), Support Vector Machines (SVM), Chi-Squared Test (CHI), Information Gain (IG), Information Gain Ratio (GR) and Minimum Redundancy Maximum Relevance (mRMR) [7]. A complete description of these techniques is out of the scope of this paper: we limit ourselves to evaluating their impact on the overall accuracy and diversity obtained by our algorithm.
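The paper does not spell out how each technique scores a property, so the sketch below only illustrates the general shape of this step for two of them: it assumes a binary matrix X whose columns are (property, value) pairs attached to the rated items, a label vector y marking positive ratings, and a column_property list mapping each column back to its DBpedia property; per-column chi-squared or mutual-information scores (the latter used here as a stand-in for Information Gain) are averaged per property and the K best properties are kept as E_LOD_FS_T.

```python
# Hedged sketch of property-level feature selection; not the authors' exact pipeline.
from collections import defaultdict
import numpy as np
from sklearn.feature_selection import chi2, mutual_info_classif

def select_properties(X, y, column_property, k=50, scorer="chi2"):
    """X: (n_ratings, n_features) binary matrix; y: 1 for positive ratings;
    column_property[j] is the DBpedia property (e.g. 'director') behind column j."""
    scores = chi2(X, y)[0] if scorer == "chi2" else mutual_info_classif(X, y)
    scores = np.nan_to_num(scores)  # constant columns may yield NaN chi-squared scores
    per_property = defaultdict(list)
    for j, s in enumerate(scores):
        per_property[column_property[j]].append(s)
    # Average the column scores per property and keep the K best properties.
    ranked = sorted(per_property, key=lambda p: np.mean(per_property[p]), reverse=True)
    return ranked[:k]
```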
3. EXPERIMENTAL EVALUATION
Our experiments were designed around four different research questions:
1. Do graph-based recommender systems benefit from the introduction of LOD-based features?
2. Do graph-based recommender systems exploiting LOD features benefit from the adoption of FS techniques?
3. Is there any correlation between the choice of the FS technique and the behavior of the algorithm?
4. How does our methodology perform with respect to state-of-the-art techniques?
Experimental design: experiments were performed on the MovieLens dataset (http://grouplens.org/datasets/movielens/), consisting of 100,000 ratings provided by 943 users on 1,682 movies. The dataset is positively balanced (55.17% of positive ratings) and shows a high sparsity (93.69%). Each user rated 84.83 items on average and each item was rated by 48.48 users on average. Experiments were performed by carrying out a 5-fold cross validation. Given that MovieLens preferences are expressed on a 5-point discrete scale, we decided to consider as positive only ratings equal to 4 and 5. As recommendation algorithm we used the previously described PageRank with Priors, set as explained in Section 2. We compared the effectiveness of our graph-based recommendation methodology by considering three different graph topologies: G, modeling the basic collaborative information about user ratings; G_LOD, which enriches G by introducing LOD-based features gathered from DBpedia; and G_LOD_T, which lightens the load of PageRank with Priors by relying on the features selected by a FS technique T. In order to enrich the graph G, each item in the dataset was mapped to a DBpedia entry. In our experiments 1,600 MovieLens entries (95.06% of the movies) were successfully mapped to a DBpedia node; the items for which a DBpedia entry was not found were represented through collaborative data points only. Overall, MovieLens entries were described through 60 different DBpedia properties. As feature selection techniques, all the approaches previously mentioned were employed, while for the parameter K (the number of LOD-based features to retain) three different values were compared: 10, 30 and 50. The performance of each graph topology was evaluated in terms of F1-measure. Moreover, we also calculated the overall running time of each experiment (experiments were run on an Intel i7-3770 CPU at 3.40 GHz with 32 GB of RAM). To answer the third research question we also evaluated the diversity of the recommendations, calculated through the classical Intra-List Diversity (ILD). Statistical significance was assessed through Wilcoxon and Friedman tests.
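For reference, the two measures used in the discussion below can be computed as in the following sketch, under their usual definitions: F1@K combines precision and recall over the top-K recommended items, while ILD averages the pairwise distance (1 - similarity) between recommended items. The similarity function is an assumption; the paper does not state which item similarity feeds its ILD values.

```python
# Standard formulations of F1@K and Intra-List Diversity (illustrative sketch).
from itertools import combinations

def f1_at_k(recommended, relevant, k):
    hits = len(set(recommended[:k]) & set(relevant))
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return 2 * precision * recall / (precision + recall) if hits else 0.0

def intra_list_diversity(recommended, similarity):
    pairs = list(combinations(recommended, 2))
    return sum(1 - similarity(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
```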
Discussion of the Results: in the first experiment we evaluated the introduction of LOD-based features in graph-based recommender systems. Results are reported in the first two columns of Table 1. On MovieLens, a statistically significant improvement (p < 0.0001, assessed through a Wilcoxon test) was obtained for all the metrics. As expected, the expansion of the graph caused a huge growth of the run time of the algorithm. This is due to the fact that the expansion stage introduced many new nodes and many new edges in the graph (see Table 1): about 50,000 new nodes and 78,000 new edges were added.

Table 1: Overall comparison among the baseline, the complete LOD-based graph and the LOD-based graph boosted by PCA. The configurations overcoming the baseline were highlighted in bold; the best-performing configuration is further highlighted with (*).

MovieLens        G         G_LOD     G_LOD+PCA
F1@5             0.5406    0.5424    0.5424 (*)
F1@10            0.6068    0.6083    0.6088 (*)
F1@15            0.5956    0.5963    0.5970 (*)
F1@20            0.5678    0.5686    0.5689 (*)
Run (min.)       72        880       581
K (LOD prop.)    0         60        50
Nodes            2,625     53,794    49,158
Edges            100,000   178,020   169,405

Next, we evaluated the impact of all the previously presented feature selection techniques in this recommendation setting. By analyzing the results provided in Table 2, it emerges that our graph-based recommendation strategy does not always benefit from the application of FS techniques. Indeed, when a very small number of properties is chosen (K=10), none of the configurations is able to overcome the baseline. By slightly increasing the value of the parameter K (K=30), only three out of seven techniques (PageRank, PCA and mRMR) improve the F1-measure. When more data points are introduced (K=50), better results are obtained and the F1-measure of the baseline is always overcome. Given that the overall number of LOD-based properties was equal to 60, it is possible to state that most of the properties encoded in the extended graph G_LOD can be considered relevant. Clearly, this is a very domain-specific outcome, which needs to be confirmed by more thorough analyses on different datasets. However, it is possible to state that the adoption of FS techniques requires a complete analysis of the usefulness of each of the properties encoded in the LOD cloud. Overall, the best-performing configuration was PCA, which was the only technique always overcoming the baseline with K=50. A Friedman test also showed that PCA statistically overcomes the other techniques for all the metrics. Another interesting outcome of the use of FS techniques is the saving of computational resources needed to run PageRank with Priors on the graph G_LOD_PCA: as shown in Table 1, the adoption of FS caused a 33.9% decrease of the run time on MovieLens (from 880 to 581 minutes). This is due to the smaller amount of information modeled in the graph (-8.6% nodes and -4.8% edges).

Table 2: Experiment 2 (MovieLens). The configurations overcoming the baseline G_LOD are emphasized in bold. For each technique, the number of features which led to the highest F1 is indicated with (*); the overall highest F1 score for each metric is marked with (*)(^). The column of the feature selection technique which performed best is coloured in grey.

Metric (G_LOD)    #feat.  PR          PCA           SVM         CHI         IG          GR          mRMR
F1@5  (0.5424)    10      0.5418      0.5406        0.5382      0.5414      0.5397      0.5372      0.5397
                  30      0.5429 (*)  0.5413        0.5413      0.5419      0.5396      0.5398      0.5429 (*)
                  50      0.5412      0.5431 (*)(^) 0.5421 (*)  0.5420 (*)  0.5412 (*)  0.5406 (*)  0.5421
F1@10 (0.6083)    10      0.6069      0.6045        0.6043      0.6056      0.6039      0.6033      0.6039
                  30      0.6084 (*)  0.6081        0.6074      0.6070      0.6055      0.6059      0.6072 (*)
                  50      0.6070      0.6088 (*)(^) 0.6081 (*)  0.6079 (*)  0.6072 (*)  0.6078 (*)  0.6077
F1@15 (0.5963)    10      0.5964      0.5948        0.5943      0.5955      0.5950      0.5938      0.5950
                  30      0.5967 (*)  0.5967        0.5964      0.5967      0.5955      0.5960      0.5961
                  50      0.5955      0.5970 (*)(^) 0.5966 (*)  0.5972 (*)  0.5962 (*)  0.5968 (*)  0.5962 (*)
F1@20 (0.5686)    10      0.5684 (*)  0.5667        0.5666      0.5672      0.5668      0.5666      0.5668
                  30      0.5684      0.5688        0.5679      0.5679      0.5675      0.5675      0.5679
                  50      0.5682      0.5689 (*)(^) 0.5683 (*)  0.5686 (*)  0.5685 (*)  0.5687 (*)  0.5685 (*)

In Experiment 3 we shifted the attention from the F1-measure to different evaluation metrics, and we investigated whether the adoption of a specific FS technique can endogenously induce a higher diversity at the expense of a little F1. Results are provided in Figure 1a. For space reasons, only the results for F1@10 are reported. In both charts we used four different symbols to identify the different behaviors of the techniques. It emerged that CHI was the least useful technique, since it did not provide any significant benefit to either F1 or diversity. PageRank provided a (small) improvement in F1 and did not significantly change the diversity of the recommendations. It is noteworthy that the larger increase in accuracy obtained by PCA is balanced by a decrease in diversity. On the other side, Gain Ratio obtained the overall best diversity of the recommendations but decreased the F1 of the algorithm. To sum up, these results show that the choice of a particular FS technique has a significant impact on the overall behavior of the recommendation algorithm. As shown in the experiment, some techniques are able to induce a higher diversity (or F1) at the expense of a little F1 (or diversity, respectively), whereas others provide a good compromise between both metrics. Clearly, more investigation is needed to analyze the behavior of each technique in depth, but these results already give some general guidelines which can drive the choice of the FS technique that best fits the requirements of a specific recommendation scenario.
Finally, we compared the effectiveness of our methodology against the current state of the art. As baselines we used User-to-User CF (U2U-KNN), Item-to-Item CF (I2I-KNN), a simple popularity-based approach, a random baseline and Bayesian Personalized Ranking Matrix Factorization (BPRMF), adopting the implementations available in the MyMediaLite Recommender System library (http://www.mymedialite.net/). For U2U and I2I the neighborhood size was set to 80, while BPRMF was run by setting the factor parameter to 100. Results are depicted in Figure 1b. As shown in the plots, our graph-based RS outperforms all the baselines for all the metrics taken into account. It is worth noting that our approach obtained a higher F1 also when compared to a well-performing matrix factorization algorithm such as BPRMF, which further confirms the effectiveness of our approach.

Figure 1: Results of the experiments. (a) Trade-off between F1 and Diversity. (b) Comparisons to baselines.

4. CONCLUSIONS AND FUTURE WORK
In this work we proposed a graph-based recommendation methodology based on PageRank with Priors, and we evaluated different techniques to automatically feed such a representation with features extracted from the LOD cloud. Results showed that graph-based RSs can benefit from the infusion of novel knowledge coming from the LOD cloud, and that a clear correlation exists between the adoption of a specific FS technique and the overall results of the recommender, since some techniques endogenously showed the ability to also increase the diversity of the recommendations generated by the algorithm. We also showed that our methodology was able to overcome several state-of-the-art baselines. As future work, we will validate the approach by evaluating it on different datasets, and we will investigate the impact of LOD-based features with different learning approaches such as Random Forests or SVMs.
5. REFERENCES
[1] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives. DBpedia: A nucleus for a Web of Open Data. Springer, 2007.
[2] P. Basile, C. Musto, M. de Gemmis, P. Lops, F. Narducci, and G. Semeraro. Aggregation strategies for linked open data-enabled recommender systems. In European Semantic Web Conference, 2014.
[3] T. H. Haveliwala. Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search. IEEE Trans. Knowl. Data Eng., 15(4):784-796, 2003.
[4] C. Musto, G. Semeraro, P. Lops, M. de Gemmis, and F. Narducci. Leveraging social media sources to generate personalized music playlists. In EC-Web 2012, volume 123 of LNBIP, pages 112-123. Springer, 2012.
[5] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the Web. Technical report, Stanford University, 1999.
[6] A. Passant. dbrec - Music Recommendations Using DBpedia. In International Semantic Web Conference, Revised Papers, volume 6497 of LNCS, pages 209-224. Springer, 2010.
[7] H. Peng, F. Long, and C. Ding. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8):1226-1238, 2005.