Semantic Interpretability of Latent Factors for Recommendation*

Vito Walter Anelli, Tommaso Di Noia, Eugenio Di Sciascio, Claudio Pomo
Polytechnic University of Bari, Bari, Italy
firstname.lastname@poliba.it

Azzurra Ragone
Independent Researcher, Milan, Italy
azzurra.ragone@gmail.com

* An extended version of this work will be presented at the International Semantic Web Conference (ISWC 2019) [2].

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). IIR 2019, September 16-18, 2019, Padova, Italy.

ABSTRACT

Model-based approaches to recommendation have proven to be very accurate. Unfortunately, by working in a latent space they lose any reference to the actual semantics of the recommended items. In this extended abstract, we show how to initialize the latent factors of Factorization Machines with semantic features coming from a knowledge graph, in order to train an interpretable model. Finally, we introduce and evaluate semantic accuracy and robustness as measures of the knowledge-aware interpretability of the model.

1 INTRODUCTION

Transparency and interpretability of predictive models are gaining momentum, since they have been recognized as a key element in the next generation of recommendation algorithms. When equipped with interpretable recommendation results, a system ceases to be a black box, and users are more willing to rely extensively on its predictions [6]. However, powerful and accurate Deep Learning and model-based recommendation algorithms project items and users into a new vector space of latent features, thus making the final result not directly interpretable. In recent years, many approaches have been proposed that take advantage of side information to enhance the performance of latent factor models. Interestingly, in [7] the authors argue for a new generation of knowledge-aware recommendation engines able to exploit the information encoded in knowledge graphs (KG) to produce meaningful recommendations. In this work, we propose knowledge-aware Hybrid Factorization Machines (kaHFM) to train interpretable models in recommendation scenarios by taking advantage of semantics-aware information. kaHFM relies on Factorization Machines (FM) [4] and extends them in several key aspects by making use of the semantic information encoded in a knowledge graph. We show how kaHFM may exploit data coming from knowledge graphs as side information to build a recommender system whose final results are accurate and, at the same time, semantically interpretable.

2 KNOWLEDGE-AWARE HYBRID FACTORIZATION MACHINES

In [1], the authors proposed to encode a Linked Data knowledge graph in a Vector Space Model (VSM) to develop a content-based recommender system. Given a set of items I = {i_1, i_2, ..., i_N} in a catalog and their associated triples ⟨i, ρ, ω⟩ in a knowledge graph KG, we may build the set of all possible features as F = {⟨ρ, ω⟩ | ⟨i, ρ, ω⟩ ∈ KG with i ∈ I}. Each item can then be represented as a vector of weights i = [v(i, ⟨ρ, ω⟩_1), ..., v(i, ⟨ρ, ω⟩_|F|)], where the generic element v(i, ⟨ρ, ω⟩) is the normalized TF-IDF value for ⟨ρ, ω⟩. Since the numerator of TF_KG can only take the values 0 or 1, and each feature under the root in the denominator has value 0 or 1, v(i, ⟨ρ, ω⟩) is zero if ⟨i, ρ, ω⟩ ∉ KG, and otherwise:

v(i, \langle \rho, \omega \rangle) = \frac{\log |I| - \log |\{ j \in I \mid \langle j, \rho, \omega \rangle \in KG \}|}{\sqrt{\sum_{\langle \rho, \omega \rangle \in F} |\{ \langle \rho, \omega \rangle \mid \langle i, \rho, \omega \rangle \in KG \}|}}    (1)
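As a concrete reference, the following Python sketch computes the item vectors of Equation (1) from a set of triples. It is a minimal sketch under the assumptions stated in the comments; all names are illustrative and not part of the authors' released implementation.

    import math
    from collections import defaultdict

    def build_item_vectors(triples, items):
        """Illustrative sketch of Eq. (1).
        triples: iterable of (item, rho, omega) pairs taken from the KG;
        items: the catalog I. Returns, per item, a dict mapping each
        feature (rho, omega) to its normalized TF-IDF weight."""
        holders = defaultdict(set)   # feature -> items exposing it (for the IDF)
        feats = defaultdict(set)     # item -> its features (binary TF)
        for i, rho, omega in triples:
            if i in items:
                holders[(rho, omega)].add(i)
                feats[i].add((rho, omega))
        vectors = {}
        for i in items:
            # Denominator of Eq. (1): root of the sum of the binary feature
            # indicators, i.e., the square root of the number of features of i.
            norm = math.sqrt(len(feats[i])) or 1.0
            vectors[i] = {f: (math.log(len(items)) - math.log(len(holders[f]))) / norm
                          for f in feats[i]}
        return vectors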
Analogously, when we have a set U of users, we may represent them using the features describing the items they enjoyed in the past. We use f to denote a feature ⟨ρ, ω⟩ ∈ F. Given a user u, if we denote with I_u the set of items enjoyed by u, we may introduce the vector u = [v(u, f_1), ..., v(u, f_|F|)], where the generic element v(u, f) is computed as:

v(u, f) = \frac{\sum_{i \in I_u} v(i, f)}{|\{ i \mid i \in I_u \text{ and } v(i, f) \neq 0 \}|}

Given the vectors u_j, with j ∈ [1..|U|], and i_p, with p ∈ [1..|I|], we build a matrix V ∈ R^{n×|F|}, where n = |U| + |I|: the first |U| rows have a one-to-one mapping with the u_j, while the last ones correspond to the i_p. In second-degree Factorization Machines, the score is computed as:

\hat{y}(x_{ui}) = w_0 + \sum_{j=1}^{n} w_j \cdot x_j + \sum_{j=1}^{n} \sum_{p=j+1}^{n} x_j \cdot x_p \cdot \sum_{f=1}^{k} v(j, f) \cdot v(p, f)    (2)

We may see that, for each x, the term \sum_{j=1}^{n} \sum_{p=j+1}^{n} x_j \cdot x_p \cdot \sum_{f=1}^{k} v(j, f) \cdot v(p, f) is non-zero only when both x_j and x_p are equal to 1. In a recommendation scenario, this happens when there is an interaction between a user and an item. Moreover, the summation \sum_{f=1}^{k} v(j, f) \cdot v(p, f) is the dot product between two vectors v_j and v_p of size k. Hence, v_j represents a latent representation of a user and v_p that of an item within the same latent space, and their interaction is evaluated through their dot product.

In order to inject the knowledge coming from KG into kaHFM, we set k = |F| in Equation (2). In other words, we impose a number of latent factors equal to the number of features describing all the items in our catalog. Since we formulated our problem as a top-N recommendation task, kaHFM can be trained with a learning-to-rank approach such as the Bayesian Personalized Ranking criterion (BPR) [5], obtaining V̂. We extract the item vectors v_j from V̂ and use them to implement an Item-kNN recommendation approach, measuring the similarity between each pair of items i and j as the cosine similarity of their corresponding vectors in V̂. In an RDF knowledge graph we usually find different types of encoded information; here we extracted the categorical information, which is mainly used to state something about the subject of an entity.
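To make the effect of setting k = |F| concrete, here is a minimal Python sketch of the score in Equation (2) for a single (user, item) pair: with the usual one-hot encoding of the pair, only x_u and x_i are non-zero, so the second-order term collapses to a dot product. The array layout and all names are assumptions made for illustration.

    import numpy as np

    def fm_score(w0, w, V, u_idx, i_idx):
        """Illustrative sketch of Eq. (2) for one (user, item) pair.
        w0: global bias; w: bias vector of length n = |U| + |I|;
        V: n x |F| factor matrix whose first |U| rows are user vectors and
        whose remaining rows are item vectors initialized from the KG."""
        linear = w[u_idx] + w[i_idx]    # surviving part of sum_j w_j * x_j
        pairwise = V[u_idx] @ V[i_idx]  # sum_f v(u, f) * v(p, f)
        return w0 + linear + pairwise

Training with BPR [5] then updates w and V so that, for each user, observed items score higher than unobserved ones; since every column of V still corresponds to a named feature ⟨ρ, ω⟩, the learned vectors remain inspectable.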
3 EXPERIMENTAL EVALUATION

We evaluated the performance of our method on two well-known datasets for recommender systems in the movie domain: Yahoo!Movies (Yahoo! Webscope dataset ydata-ymovies-user-movie-ratings-content-v1_0, http://research.yahoo.com/Academic_Relations) and Facebook Movies (https://2015.eswc-conferences.org/program/semwebeval.html). Experiments were conducted adopting the "All Unrated Items" protocol and a temporal 80-20 hold-out split [3]. All the items in the datasets come with a DBpedia link. We retrieved all the ⟨ρ, ω⟩ pairs (available at https://github.com/sisinflab/LinkedDatasets/), excluding noisy features based on the following predicates: owl:sameAs, dbo:thumbnail, foaf:depiction, prov:wasDerivedFrom, foaf:isPrimaryTopicOf.

Accuracy Evaluation. The goal of this evaluation is to assess whether the controlled injection of Linked Data positively affects the training of FM. We compared kaHFM (https://github.com/sisinflab/HybridFactorizationMachines/) with a canonical second-degree FM optimized via BPR (BPR-FM); in order to preserve the expressiveness of the model, we used the same number of hidden factors as in kaHFM. Since we use item similarity in the last step of our approach, we also compared kaHFM against an Attribute-Based Item-kNN (ABItem-kNN) algorithm, where each item is represented as a vector of weights computed through a TF-IDF model, as well as against Item-kNN and User-kNN based on cosine similarity, Most-Popular, and a knowledge-graph-based VSM adopting the representation formulated in [1]. We measured accuracy through Precision@N and Normalized Discounted Cumulative Gain (nDCG@N). Table 1 shows the results for the categorical setting (CS): kaHFM achieves the best result on every metric, and statistically significant differences with respect to the best performing method are denoted with a ∗ mark (Student's paired t-test, 0.05 level).

                 Facebook Movies   Yahoo!Movies
                 Precision@10      Precision@10   nDCG@10
ABItem-kNN       0.0173∗           0.0421∗        0.1174∗
BPR-FM           0.0158∗           0.0189∗        0.0344∗
MostPopular      0.0118∗           0.0154∗        0.0271∗
ItemKnn          0.0262∗           0.0203∗        0.0427∗
UserKnn          0.0168∗           0.0231∗        0.0474∗
VSM              0.0185∗           0.0385∗        0.1129∗
kaHFM            0.0296            0.0524         0.1399

Table 1: Accuracy results for Facebook Movies and Yahoo!Movies considering Top-10 recommendations and a relevance threshold of 4 out of 5 stars.

Semantic Accuracy. The main idea behind Semantic Accuracy is to evaluate, given an item i, how well kaHFM ranks the item's original features at the top of the computed vector v_i. In other words, given the subset of features describing i, F_i = {f_1^i, ..., f_m^i, ..., f_M^i} with F_i ⊆ F, we check whether the values in v_i corresponding to f_m^i ∈ F_i are higher than those corresponding to f ∉ F_i. For the set of M features initially describing i, we count how many of them appear in the set top(v_i, M) of the top-M features of v_i; we then normalize this number by the size of F_i and average over all the items in the catalog I. Table 2 shows the results for SA@nM with n ∈ {1, 2, 3, 4, 5} and M = 10, i.e., it reports how many of the ground features are available among the top-nM elements of v_i for each dataset.

                 SA@M    SA@2M   SA@3M   SA@4M   SA@5M   F.A.
Yahoo!Movies     0.847   0.863   0.865   0.868   0.873   12.143
Facebook Movies  0.864   0.883   0.889   0.894   0.899   12.856

Table 2: Semantic Accuracy results for different values of M. F.A. denotes the average number of features per item.

Generative Robustness. To check whether kaHFM promotes important features for an item i, we propose a new measure: Generative Robustness. Suppose that a particular feature ⟨ρ, ω⟩ is useful to describe an item i but the corresponding triple ⟨i, ρ, ω⟩ is not represented in the knowledge graph. If kaHFM is robust in generating weights for unknown features, it should discover the importance of that feature and modify its value so that it enters the top-K features of v_i. Starting from this observation, the idea behind measuring robustness is to "forget" a triple involving i and check whether kaHFM can generate it. Given a catalog I, we define the Robustness for 1 removed feature @M (1-Rob@M) as the fraction of items for which the removed feature is in the top-M after training; similarly to SA@nM, we may define 1-Rob@nM. The previous experiment showed that kaHFM was able to guess, on average, 10 out of 12 different features for Yahoo!Movies. In this experiment we remove one of those ten features (thus, based on Table 2, kaHFM will guess an average of 10 − 1 = 9 features); since the average number of features is 12, we have 3 remaining "slots". Table 3 measures how often kaHFM is able to place the removed feature in these "slots".

                 1-Rob@M   1-Rob@2M   1-Rob@3M   1-Rob@4M   1-Rob@5M   F.A.
Yahoo!Movies     0.487     0.645      0.713      0.756      0.793      12.143
Facebook Movies  0.821     0.945      0.970      0.980      0.984      12.856

Table 3: 1-Robustness for different values of M. Column F.A. denotes the average number of features per item.
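For reference, a minimal sketch of both measures follows, assuming the trained item vectors are available as feature-to-weight dictionaries and that, for 1-Rob, the model has been retrained once per item with the chosen triple removed; this layout and the helper names are our own illustration, not the paper's code.

    import numpy as np

    def top_feats(vec, m):
        # Features carrying the m largest learned weights in v_i
        return {f for f, _ in sorted(vec.items(), key=lambda kv: -kv[1])[:m]}

    def sa_at_nm(vectors, ground, n, M=10):
        """SA@nM: share of an item's ground features found among the top-nM
        entries of its trained vector, averaged over the catalog."""
        return np.mean([len(top_feats(vectors[i], n * M) & ground[i]) / len(ground[i])
                        for i in ground])

    def one_rob_at_nm(vectors, removed, n, M=10):
        """1-Rob@nM: fraction of items whose removed feature re-enters the
        top-nM features of the retrained vector."""
        return np.mean([removed[i] in top_feats(vectors[i], n * M)
                        for i in removed])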
4 CONCLUSION AND FUTURE WORK

We have proposed kaHFM, an interpretable method for recommendation scenarios in which we bind the meaning of the latent factors of a Factorization Machine to data coming from a knowledge graph. We considered categorical information coming from DBpedia and showed, on two publicly available datasets, that the generated recommendations are more precise and personalized. We also showed that the computed features are semantically meaningful and that the model is robust with respect to the computed features. In the future, we want to test the performance of kaHFM in classical Information Retrieval and knowledge graph completion tasks.

REFERENCES

[1] Vito Walter Anelli, Tommaso Di Noia, Pasquale Lops, and Eugenio Di Sciascio. 2017. Feature Factorization for Top-N Recommendation: From Item Rating to Features Relevance. In Proc. of the 1st Workshop on Intelligent Recommender Systems by Knowledge Transfer & Learning, co-located with the ACM Conf. on Recommender Systems (RecSys 2017), Como, Italy, August 27, 2017 (CEUR Workshop Proceedings), Vol. 1887. CEUR-WS.org, 16–21.
[2] Vito Walter Anelli, Tommaso Di Noia, Eugenio Di Sciascio, Azzurra Ragone, and Joseph Trotta. 2019. How to Make Latent Factors Interpretable by Feeding Factorization Machines with Knowledge Graphs. In The Semantic Web – ISWC 2019 – 18th International Semantic Web Conference, Auckland, NZ, October 26–30, 2019.
[3] Vito Walter Anelli, Tommaso Di Noia, Eugenio Di Sciascio, Azzurra Ragone, and Joseph Trotta. 2019. Local Popularity and Time in Top-N Recommendation. In Advances in Information Retrieval – 41st European Conference on IR Research, ECIR 2019, Cologne, Germany, April 14–18, 2019, Proceedings, Part I. 861–868.
[4] Steffen Rendle. 2010. Factorization Machines. In Data Mining (ICDM), 2010 IEEE 10th Int. Conf. on. IEEE, 995–1000.
[5] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian Personalized Ranking from Implicit Feedback. In UAI 2009, Proc. of the Twenty-Fifth Conf. on Uncertainty in Artificial Intelligence, Montreal, QC, Canada, June 18–21, 2009. 452–461.
[6] Markus Zanker. 2012. The Influence of Knowledgeable Explanations on Users' Perception of a Recommender System. In Sixth ACM Conf. on Recommender Systems, RecSys '12, Dublin, Ireland, September 9–13, 2012. 269–272.
[7] Yongfeng Zhang and Xu Chen. 2018. Explainable Recommendation: A Survey and New Perspectives. CoRR abs/1804.11192 (2018).