=Paper= {{Paper |id=Vol-1905/recsys2017_poster10 |storemode=property |title=SemRevRec: A Recommender System based on User Reviews and Linked Data |pdfUrl=https://ceur-ws.org/Vol-1905/recsys2017_poster10.pdf |volume=Vol-1905 |authors=Iacopo Vagliano,Diego Monti,Maurizio Morisio |dblpUrl=https://dblp.org/rec/conf/recsys/VaglianoMM17 }} ==SemRevRec: A Recommender System based on User Reviews and Linked Data== https://ceur-ws.org/Vol-1905/recsys2017_poster10.pdf
     SemRevRec: A Recommender System based on User Reviews
                       and Linked Data
                 Iacopo Vagliano                                Diego Monti                              Maurizio Morisio
    ZBW - Leibniz Information Centre for                    Politecnico di Torino                       Politecnico di Torino
        Economics, Kiel, Germany                                 Turin, Italy                                Turin, Italy
            i.vagliano@zbw.eu                               diego.monti@polito.it                     maurizio.morisio@polito.it

ABSTRACT                                                                 entities and Linked Data, while the latter provides suggestions to
Traditionally, recommender systems exploit user ratings to infer         users. The two modules are disconnected: the recommendation
preferences. However, the growing popularity of social platforms         module works online, while the other works offline and provides
has encouraged users to write textual reviews about liked items.         the entities which can be recommended. Every time a new review
These reviews represent a valuable source of non-trivial informa-        is submitted, the system can repeat the semantic annotation and
tion that could improve users’ decision processes. In this paper we      discovery steps and possibly identify new entities.
propose a novel recommendation approach based on the semantic               Although our approach is not bound to a particular domain, in
annotation of entities mentioned in user reviews and on the knowl-       our implementation, we exploited reviews from IMDb1 because we
edge available in the Web of Data. We compared our recommender           focused on movies. We chose DBpedia for annotation and discovery
system with two baseline algorithms and a state-of-the-art Linked        because it is one of the main datasets in the Web of Data.
Data based approach. Our system provided more diverse recom-
mendations with respect to the other techniques considered, while        2.1    Semantic Annotation and Discovery
obtaining a better accuracy than the Linked Data based method.           The semantic annotation technique associates a URI to the entities
ACM Reference format:                                                    recognized in a given text to add information about their meaning.
Iacopo Vagliano, Diego Monti, and Maurizio Morisio. 2017. SemRevRec: A   In our case, the entities identified in the reviews are resources in the
Recommender System based on User Reviews and Linked Data. In Proceed-    Web of Data. Thus, the semantic annotation and discovery module
ings of RecSys 2017 Posters, Como, Italy, August 27-31, 2 pages.         can find other resources that are linked with the annotated entities,
                                                                         in order to enable our system to recommend more items.
                                                                            In our implementation, we relied on AIDA [3] to annotate reviews
1    INTRODUCTION                                                        with DBpedia resources. We exploited the DBpedia properties
Currently, most recommender systems exploit user ratings to              dbo:starring and dbo:director for discovering, through
infer preferences, although the growing popularity of social and         SPARQL queries, additional resources that are connected with the
e-commerce websites has encouraged users to write textual reviews.       annotated entities. The underlying hypothesis is that most of the
These reviews enable recommender systems to represent the multi-         entities, if not movies, should be actors or directors. However, these
faceted nature of users’ opinions and build a fine-grained preference    properties can be configured according to the domain and the
model, which cannot be obtained from overall ratings [2].                dataset considered. Given the annotated entities, the discoverer
    In this paper we describe how the information extracted from         retrieves other relevant entities. This allows the system to discover
user reviews, combined with Linked Data, can be exploited in rec-        other movies from the same director or actor named in a given
ommendation tasks. On one side, Linked Data can provide a rich           review and significantly improve the accuracy of the recommenda-
representation of the items to recommend since they include inter-       tions. E.g., if Christopher Nolan was annotated in a review of The
esting features. On the other side, reviews may reveal additional        Dark Knight, Interstellar can be found because it is directed by him.
connections among items: for instance, various reviews of Inter-            The semantic annotation and discovery module stores both an-
stellar mention Stanley Kubrick, although there is not a direct link     notated and discovered entities. The URI of each annotated entity
between these two resources in DBpedia. We propose a novel recom-        is associated with the URI of the reviewed item and with the occur-
mendation approach based on the semantic annotation of reviews           rence of that entity in all the reviews of that item. The same entity
to extract useful information from them. A preliminary offline study     may, in fact, appear in reviews regarding different items. Similarly,
suggests that our method provides better prediction and ranking          the URI of each discovered entity is stored together with the URI
accuracy than another recommender system based on Linked Data,           of the annotated entity through which it was discovered and their
while it increases the diversity of recommendations with respect to      Linked Data Semantic Distance (LDSD) [5], a measure inversely
all the techniques considered.                                           proportional to the number of links between two resources.
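
The discovery step of Section 2.1 can be illustrated with a short Python sketch. It builds a SPARQL query that retrieves the resources linked to an annotated entity through dbo:director or dbo:starring, and it shows one possible shape for the stored records. The function name, the record classes, and their fields are hypothetical and are not taken from the SemRevRec implementation; the query is only constructed here, not sent to an endpoint.

```python
from dataclasses import dataclass

# Properties named in the text; the paper notes that they are configurable.
DISCOVERY_PROPERTIES = ("dbo:director", "dbo:starring")


def build_discovery_query(entity_uri, properties=DISCOVERY_PROPERTIES):
    """Build a SPARQL query for items linked to entity_uri (hypothetical helper)."""
    # One graph pattern per property, e.g. { ?item dbo:director <uri> }.
    patterns = " UNION ".join(
        f"{{ ?item {prop} <{entity_uri}> }}" for prop in properties
    )
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT DISTINCT ?item WHERE {\n"
        f"  {patterns}\n"
        "}"
    )


@dataclass(frozen=True)
class AnnotatedEntity:
    """An entity found in reviews (assumed record layout)."""
    entity_uri: str   # URI of the annotated entity
    item_uri: str     # URI of the reviewed item
    occurrence: int   # occurrences in all the reviews of that item


@dataclass(frozen=True)
class DiscoveredEntity:
    """An entity reached through an annotated one (assumed record layout)."""
    entity_uri: str   # URI of the discovered entity
    source_uri: str   # annotated entity through which it was discovered
    ldsd: float       # Linked Data Semantic Distance between the two


query = build_discovery_query("http://dbpedia.org/resource/Christopher_Nolan")
print(query)
```

With the entity of the running example, the printed query asks for every resource that has Christopher Nolan as director or as a starring actor, which is how Interstellar would be found from a review of The Dark Knight.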

2    APPROACH                                                            2.2    Recommendation
SemRevRec consists of two main modules: semantic annotation              The recommendation process consists of two main steps: the gener-
and discovery, and recommendation. The former is responsible for         ation of the candidate recommendations and their ranking. Given
feeding the recommendation module with semantically annotated            an initial item, SemRevRec retrieves all the entities related to it.
RecSys 2017 Poster Proceedings, August 27-31, Como, Italy
                                                                         1 http://www.imdb.com
Copyright held by the authors.


    Firstly, the system selects the annotated entities which were mentioned
in the reviews of the initial item. Afterward, it obtains the entities which
mention the initial item, i.e. entities whose reviews generated an annotated
entity that corresponds to the initial item. For example, if the initial item
is Interstellar and a review of 2001: A Space Odyssey mentions Interstellar,
then 2001: A Space Odyssey is considered as a candidate recommendation.

                 Table 1: Results of the experiment

    Algorithm     Precision   Recall    nDCG      EBN       Diversity
    SemRevRec     0.0882      0.0459    0.0589    1.5671    0.1838
    SPrank        0.0584      0.0327    0.0409    0.8244    0.1551
    Popular       0.1325      0.0840    0.0969    2.7439    0.1412
    Random        0.0055      0.0028    0.0031    0.3018    0.1679

    Subsequently, SemRevRec retrieves the relevant discovered en-
tities. These can be entities discovered through the initial item.
For instance, if the initial item is Interstellar and The Dark Knight       LambdaMart as ranking method and the properties related to the
was previously discovered because both these movies have been               movie domain (dct:subject, dbo:director, and dbo:starring).
directed by Christopher Nolan, The Dark Knight is selected. Similarly,         Table 1 lists the results of the experiment. For all measures
the entities discovered through other entities which were annotated         except EBN, higher values mean better results; for EBN, lower
in the reviews of the initial item are relevant. E.g., if Interstellar is   values mean higher novelty. SemRevRec provided better prediction
the initial item, Stanley Kubrick was annotated in one of its reviews,      accuracy and ranking than SPrank, and it improved novelty with
and 2001: A Space Odyssey was discovered through Stanley Kubrick,           respect to the Most Popular technique. However, SPrank obtained
then 2001: A Space Odyssey is a candidate recommendation.                   higher novelty than SemRevRec. The diversity of the algorithms
    Finally, SemRevRec ranks the candidate recommendations. The             was similar, but our system achieved the best diversity.
ranking function (Equation 1) considers the occurrence occurrence_i
of entities in the reviews and the Linked Data Semantic Distance            4    CONCLUSIONS AND FUTURE WORK
(LDSD) between each discovered entity and the entity through
                                                                            In this paper we proposed a novel recommendation approach based
which it was discovered. This avoids assigning the same value to
                                                                            on the semantic annotation of reviews to extract information as
all the entities discovered through the same annotated entity. The
                                                                            Linked Data. Our method discovers additional resources and gen-
item i can be an annotated or a discovered entity. The α coefficient
                                                                            erates recommendations by exploiting the annotated entities. A
is 1 if the item i is an annotated entity, while it can be configured
                                                                            preliminary offline study conducted in the movie domain suggested
to a custom value for the discovered entities (0.5 by default). For
                                                                            that our algorithm provides better prediction accuracy and ranking
the discovered entities, the occurrence of entities through which
                                                                            than another method based on Linked Data, while it increases the
they were discovered is used, multiplied by α. To obtain a value
                                                                            diversity of recommendations with respect to the other techniques
between 0 and 1, the occurrence is normalized with respect to the
                                                                            considered. Although we have tested our approach in only one domain,
maximum occurrence of entities j which belong to the candidate
                                                                            we could apply it to others, provided that reviews are available. As future
recommendation set CR. The β coefficient is 1 if i is an annotated
                                                                            work, we plan to evaluate SemRevRec in other domains, such as
entity, 0.5 otherwise. The γ coefficient is 0.5 for discovered entities,
                                                                            music and books, and also consider, during ranking, the sentiment
0 otherwise. In this way, the function returns a number between
                                                                            and the linking confidence associated with the annotated entities.
0 and 1, which is equal to the first term for the annotated entities,
while, for the discovered entities, it represents the average of the
                                                                            ACKNOWLEDGMENTS
first term and 1 − LDSD(i, i_o), where i_o is the entity through which
it was discovered.                                                          This work was supported by the EU’s Horizon 2020 programme
                                                                            under grant agreement H2020-693092 MOVING.
\[ R(i) = \beta \cdot \frac{\alpha \cdot \mathit{occurrence}_i}{\max_{j \in CR}(\mathit{occurrence}_j)} + \gamma \cdot (1 - \mathit{LDSD}(i, i_o)) \tag{1} \]
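
As a worked illustration of Equation 1, the following Python sketch applies the coefficient values stated in the text: α = β = 1 and γ = 0 for annotated entities, and β = γ = 0.5 with a configurable α (0.5 by default) for discovered entities. The function name and its parameters are assumptions made for this example and do not come from the SemRevRec code. For a discovered entity, the occurrence argument is the occurrence of the entity through which it was discovered.

```python
def ranking_score(occurrence, max_occurrence, discovered=False, ldsd=0.0, alpha=0.5):
    """Compute R(i) as in Equation 1 (illustrative sketch, hypothetical signature).

    occurrence     -- occurrence_i, or that of the source entity if discovered
    max_occurrence -- maximum occurrence over the candidate set CR
    discovered     -- True for a discovered entity, False for an annotated one
    ldsd           -- LDSD(i, i_o); only used for discovered entities
    alpha          -- alpha coefficient for discovered entities
    """
    if max_occurrence <= 0:
        raise ValueError("max_occurrence must be positive")
    if discovered:
        a, beta, gamma = alpha, 0.5, 0.5
    else:
        a, beta, gamma = 1.0, 1.0, 0.0
    first_term = a * occurrence / max_occurrence
    return beta * first_term + gamma * (1.0 - ldsd)


# Annotated entity mentioned 4 times, with a maximum occurrence of 8 in CR.
print(ranking_score(4, 8))
# Entity discovered through an entity mentioned 8 times, with LDSD = 0.5.
print(ranking_score(8, 8, discovered=True, ldsd=0.5))
```

In the first call the score equals the normalized occurrence, 4/8. In the second it is the average of the first term (0.5 · 8/8) and 1 − LDSD, so both calls yield 0.5 with these inputs.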
                                                                            REFERENCES
3    EVALUATION                                                              [1] Alejandro Bellogín, Iván Cantador, and Pablo Castells. 2010. A Study of Heterogeneity
                                                                                 in Recommendations for a Social Music Service. In Proceedings of the 1st International
We evaluated SemRevRec with a preliminary offline experiment                     Workshop on Information Heterogeneity and Fusion in Recommender Systems
conducted in the movie domain. Its purpose is to compare our                     (HetRec ’10). ACM, 1–8. https://doi.org/10.1145/1869446.1869447
                                                                             [2] Li Chen, Guanliang Chen, and Feng Wang. 2015. Recommender systems based
proposal with a state-of-the-art recommender system based on                     on user reviews: The state of the art. User Modeling and User-Adapted Interaction
Linked Data and two baseline algorithms. We annotated the reviews                25, 2 (2015), 99–154. https://doi.org/10.1007/s11257-015-9155-5
available on IMDb for the top-250 movies2 . We also relied on the            [3] Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Man-
                                                                                 fred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum.
MovieLens 1M dataset3 for obtaining the actual user ratings.                     2011. Robust Disambiguation of Named Entities in Text. In Conference on Empiri-
   The evaluation was performed with LibRec4 . We executed a 5-                  cal Methods in Natural Language Processing, EMNLP 2011, Edinburgh, Scotland.
                                                                                 782–792.
fold cross-validation considering as positive the ratings greater than       [4] Tommaso Di Noia, Vito Claudio Ostuni, Paolo Tomeo, and Eugenio Di Sciascio.
3 on a scale from 1 to 5. Using the top-10 recommendations for each              2016. SPrank: Semantic Path-Based Ranking for Top-N Recommendations Using
user, we computed the measures of precision, recall, nDCG, En-                   Linked Open Data. ACM Transactions on Intelligent Systems and Technology 8, 1
                                                                                 (2016), 9:1–9:34. https://doi.org/10.1145/2899005
tropy Based Novelty (EBN) [1], and diversity [6]. We compared our            [5] Alexandre Passant. 2010. dbrec - Music Recommendations Using DBpedia. In
technique with the Most Popular and the Random Guess baseline                    The Semantic Web - ISWC 2010. Springer Berlin Heidelberg, 209–224.
algorithms, and with SPrank [4]. We configured SPrank to exploit             [6] Mi Zhang and Neil Hurley. 2008. Avoiding Monotony: Improving the Di-
                                                                                 versity of Recommendation Lists. In Proceedings of the 2008 ACM Conference
2 http://www.imdb.com/chart/top                                                  on Recommender Systems (RecSys ’08). ACM, New York, NY, USA, 123–130.
3 http://grouplens.org/datasets/movielens/1m/                                    https://doi.org/10.1145/1454008.1454030
4 https://www.librec.net
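
The top-10 accuracy measures of Section 3 can be illustrated with a short Python sketch. It assumes binary relevance, with a rating greater than 3 counted as positive as in the text, and the common log2-discounted form of nDCG. The exact definitions implemented in LibRec may differ, and the function and variable names below are hypothetical.

```python
import math


def positive_items(ratings, threshold=3):
    """Items whose rating is strictly greater than the threshold."""
    return {item for item, rating in ratings.items() if rating > threshold}


def top_k_metrics(recommended, relevant, k=10):
    """Return (precision, recall, nDCG) at k, assuming binary relevance."""
    top = list(recommended)[:k]
    hits = [1 if item in relevant else 0 for item in top]
    precision = sum(hits) / k
    recall = sum(hits) / len(relevant) if relevant else 0.0
    # DCG discounts each hit by log2(position + 1), with positions from 1.
    dcg = sum(hit / math.log2(rank + 2) for rank, hit in enumerate(hits))
    # The ideal ranking places all relevant items (at most k) first.
    idcg = sum(1 / math.log2(rank + 2) for rank in range(min(len(relevant), k)))
    ndcg = dcg / idcg if idcg > 0 else 0.0
    return precision, recall, ndcg


ratings = {"m1": 5, "m2": 3, "m3": 4, "m4": 2}
relevant = positive_items(ratings)
print(top_k_metrics(["m1", "m4", "m3", "m2"], relevant, k=4))
```

In this toy case two of the four recommended movies are positive, so precision is 0.5 and recall is 1.0, while nDCG is below 1 because the second positive movie appears in third position.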