=Paper=
{{Paper
|id=Vol-1905/recsys2017_poster10
|storemode=property
|title=SemRevRec: A Recommender System based on User Reviews and Linked Data
|pdfUrl=https://ceur-ws.org/Vol-1905/recsys2017_poster10.pdf
|volume=Vol-1905
|authors=Iacopo Vagliano,Diego Monti,Maurizio Morisio
|dblpUrl=https://dblp.org/rec/conf/recsys/VaglianoMM17
}}
==SemRevRec: A Recommender System based on User Reviews and Linked Data==
* Iacopo Vagliano, ZBW - Leibniz Information Centre for Economics, Kiel, Germany, i.vagliano@zbw.eu
* Diego Monti, Politecnico di Torino, Turin, Italy, diego.monti@polito.it
* Maurizio Morisio, Politecnico di Torino, Turin, Italy, maurizio.morisio@polito.it

===ABSTRACT===
Traditionally, recommender systems exploit user ratings to infer preferences. However, the growing popularity of social platforms has encouraged users to write textual reviews about liked items. These reviews represent a valuable source of non-trivial information that could improve users’ decision processes. In this paper we propose a novel recommendation approach based on the semantic annotation of entities mentioned in user reviews and on the knowledge available in the Web of Data. We compared our recommender system with two baseline algorithms and a state-of-the-art Linked Data based approach. Our system provided more diverse recommendations with respect to the other techniques considered, while obtaining a better accuracy than the Linked Data based method.

ACM Reference format: Iacopo Vagliano, Diego Monti, and Maurizio Morisio. 2017. SemRevRec: A Recommender System based on User Reviews and Linked Data. In Proceedings of RecSys 2017 Posters, Como, Italy, August 27-31, 2 pages.

RecSys 2017 Poster Proceedings, August 27-31, Como, Italy. Copyright held by the authors.

===1 INTRODUCTION===
Currently, most recommender systems exploit user ratings to infer preferences, although the growing popularity of social and e-commerce websites has encouraged users to write textual reviews. These reviews enable recommender systems to represent the multi-faceted nature of users’ opinions and build a fine-grained preference model, which cannot be obtained from overall ratings [2].

In this paper we describe how the information extracted from user reviews, combined with Linked Data, can be exploited in recommendation tasks. On one side, Linked Data can provide a rich representation of the items to recommend since they include interesting features. On the other side, reviews may reveal additional connections among items: for instance, various reviews of Interstellar mention Stanley Kubrick, although there is no direct link between these two resources in DBpedia. We propose a novel recommendation approach based on the semantic annotation of reviews to extract useful information from them. A preliminary offline study suggests that our method provides better prediction and ranking accuracy than another recommender system based on Linked Data, while it increases the diversity of recommendations with respect to all the techniques considered.

===2 APPROACH===
SemRevRec consists of two main modules: semantic annotation and discovery, and recommendation. The former is responsible for feeding the recommendation module with semantically annotated entities and Linked Data, while the latter provides suggestions to users. The two modules are disconnected: the recommendation module works online, while the other works offline and provides the entities which can be recommended. Every time a new review is submitted, the system can repeat the semantic annotation and discovery steps and possibly identify new entities.

Although our approach is not bound to a particular domain, in our implementation we exploited reviews from IMDb (http://www.imdb.com) because we focused on movies. We chose DBpedia for annotation and discovery because it is one of the main datasets in the Web of Data.

====2.1 Semantic Annotation and Discovery====
The semantic annotation technique associates a URI to the entities recognized in a given text to add information about their meaning. In our case, the entities identified in the reviews are resources in the Web of Data. Thus, the semantic annotation and discovery module can find other resources that are linked with the annotated entities, in order to enable our system to recommend more items.

In our implementation, we relied on AIDA [3] to annotate reviews with DBpedia resources. We exploited the DBpedia properties dbo:starring and dbo:director for discovering, through SPARQL queries, additional resources that are connected with the annotated entities. The underlying hypothesis is that most of the entities, if not movies, should be actors or directors. However, these properties can be configured according to the domain and the dataset considered. Given the annotated entities, the discoverer retrieves other relevant entities. This allows the system to discover other movies from the same director or actor named in a given review and significantly improve the accuracy of the recommendations. For example, if Christopher Nolan was annotated in a review of The Dark Knight, Interstellar can be found because it is directed by him.

The semantic annotation and discovery module stores both annotated and discovered entities. The URI of each annotated entity is associated with the URI of the reviewed item and with the occurrence of that entity in all the reviews of that item. The same entity may, in fact, appear in reviews regarding different items. Similarly, the URI of each discovered entity is stored together with the URI of the annotated entity through which it was discovered and their Linked Data Semantic Distance (LDSD) [5], a measure inversely proportional to the number of links between two resources.

====2.2 Recommendation====
The recommendation process consists of two main steps: the generation of the candidate recommendations and their ranking. Given an initial item, SemRevRec retrieves all the entities related to it. Firstly, the system selects the annotated entities which were mentioned in the reviews of the initial item. Afterward, it obtains the entities which mention the initial item, i.e. entities whose reviews generated an annotated entity that corresponds to the initial item. For example, if the initial item is Interstellar and a review of 2001: A Space Odyssey mentions Interstellar, then 2001: A Space Odyssey is considered as a candidate recommendation.

Subsequently, SemRevRec retrieves the relevant discovered entities. These can be entities discovered through the initial item. For instance, if the initial item is Interstellar and The Dark Knight was previously discovered because both these movies have been directed by Christopher Nolan, The Dark Knight is selected. Similarly, the entities discovered through other entities which were annotated in the reviews of the initial item are relevant. For example, if Interstellar is the initial item, Stanley Kubrick was annotated in one of its reviews, and 2001: A Space Odyssey was discovered through Stanley Kubrick, then 2001: A Space Odyssey is a candidate recommendation.

Finally, SemRevRec ranks the candidate recommendations. The ranking function (Equation 1) considers the occurrence <math>\mathit{occurrence}_i</math> of entities in the reviews and the Linked Data Semantic Distance (LDSD) between each discovered entity and the entity through which it was discovered. This avoids assigning the same value to all the entities discovered through the same annotated entity. The item <math>i</math> can be an annotated or a discovered entity. The <math>\alpha</math> coefficient is 1 if the item <math>i</math> is an annotated entity, while it can be configured to a custom value for the discovered entities (0.5 by default). For the discovered entities, the occurrence of the entities through which they were discovered is used, multiplied by <math>\alpha</math>. To obtain a value between 0 and 1, the occurrence is normalized with respect to the maximum occurrence of the entities <math>j</math> which belong to the candidate recommendation set <math>CR</math>. The <math>\beta</math> coefficient is 1 if <math>i</math> is an annotated entity, 0.5 otherwise. The <math>\gamma</math> coefficient is 0.5 for discovered entities, 0 otherwise. In this way, the function returns a number between 0 and 1, which is equal to the first term for the annotated entities, while, for the discovered entities, it represents the average of the first term and <math>1 - LDSD(i, i_o)</math>, where <math>i_o</math> is the entity through which it was discovered.

<math>R(i) = \beta \cdot \frac{\alpha \cdot \mathit{occurrence}_i}{\max_{j \in CR}(\mathit{occurrence}_j)} + \gamma \cdot (1 - LDSD(i, i_o)) \qquad (1)</math>

===3 EVALUATION===
We evaluated SemRevRec with a preliminary offline experiment conducted in the movie domain. Its purpose is to compare our proposal with a state-of-the-art recommender system based on Linked Data and two baseline algorithms. We annotated the reviews available on IMDb for the top-250 movies (http://www.imdb.com/chart/top). We also relied on the MovieLens 1M dataset (http://grouplens.org/datasets/movielens/1m/) for obtaining the actual user ratings.

The evaluation was performed with LibRec (https://www.librec.net). We executed a 5-fold cross-validation, considering as positive the ratings greater than 3 on a scale from 1 to 5. Using the top-10 recommendations for each user, we computed the measures of precision, recall, nDCG, Entropy Based Novelty (EBN) [1], and diversity [6]. We compared our technique with the Most Popular and the Random Guess baseline algorithms, and with SPrank [4]. We configured SPrank to exploit LambdaMart as ranking method and the properties related to the movie domain (dct:subject, dbo:director, and dbo:starring).

Table 1 lists the results of the experiment. For all the measures but EBN, higher values mean better results, while the lower the EBN, the higher the novelty. SemRevRec provided a better prediction accuracy and ranking than SPrank, while it improved in novelty with respect to the Most Popular technique. However, SPrank obtained a higher novelty than SemRevRec. The diversity of the algorithms was similar, but our system achieved the best diversity.

{| class="wikitable"
|+ Table 1: Results of the experiment
! Algorithm !! Precision !! Recall !! nDCG !! EBN !! Diversity
|-
| SemRevRec || 0.0882 || 0.0459 || 0.0589 || 1.5671 || 0.1838
|-
| SPrank || 0.0584 || 0.0327 || 0.0409 || 0.8244 || 0.1551
|-
| Popular || 0.1325 || 0.0840 || 0.0969 || 2.7439 || 0.1412
|-
| Random || 0.0055 || 0.0028 || 0.0031 || 0.3018 || 0.1679
|}

===4 CONCLUSIONS AND FUTURE WORK===
In this paper we proposed a novel recommendation approach based on the semantic annotation of reviews to extract information as Linked Data. Our method discovers additional resources and generates recommendations by exploiting the annotated entities. A preliminary offline study conducted in the movie domain suggested that our algorithm provides better prediction accuracy and ranking than another method based on Linked Data, while it increases the diversity of recommendations with respect to the other techniques considered. Although we have tested our approach in only one domain, we could apply it to others, provided that reviews are available. As future work, we plan to evaluate SemRevRec in other domains, such as music and books, and also to consider, during ranking, the sentiment and the linking confidence associated with the annotated entities.

===ACKNOWLEDGMENTS===
This work was supported by the EU’s Horizon 2020 programme under grant agreement H2020-693092 MOVING.

===REFERENCES===
* [1] Alejandro Bellogín, Iván Cantador, and Pablo Castells. 2010. A Study of Heterogeneity in Recommendations for a Social Music Service. In Proceedings of the 1st International Workshop on Information Heterogeneity and Fusion in Recommender Systems (HetRec ’10). ACM, 1–8. https://doi.org/10.1145/1869446.1869447
* [2] Li Chen, Guanliang Chen, and Feng Wang. 2015. Recommender systems based on user reviews: The state of the art. User Modeling and User-Adapted Interaction 25, 2 (2015), 99–154. https://doi.org/10.1007/s11257-015-9155-5
* [3] Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. 2011. Robust Disambiguation of Named Entities in Text. In Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, Edinburgh, Scotland. 782–792.
* [4] Tommaso Di Noia, Vito Claudio Ostuni, Paolo Tomeo, and Eugenio Di Sciascio. 2016. SPrank: Semantic Path-Based Ranking for Top-N Recommendations Using Linked Open Data. ACM Transactions on Intelligent Systems and Technology 8, 1 (2016), 9:1–9:34. https://doi.org/10.1145/2899005
* [5] Alexandre Passant. 2010. dbrec - Music Recommendations Using DBpedia. In The Semantic Web - ISWC 2010. Springer Berlin Heidelberg, 209–224.
* [6] Mi Zhang and Neil Hurley. 2008. Avoiding Monotony: Improving the Diversity of Recommendation Lists. In Proceedings of the 2008 ACM Conference on Recommender Systems (RecSys ’08). ACM, New York, NY, USA, 123–130. https://doi.org/10.1145/1454008.1454030
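
As a rough illustration of the ranking function in Equation 1 (Section 2.2), the following Python sketch scores and sorts a candidate set. It is not the authors' implementation: the `Candidate` class, the function names, and the convention that a missing LDSD value marks an annotated entity are assumptions made for this example. It also assumes that, for a discovered entity, the stored occurrence is that of the annotated entity through which it was discovered, and that the normalizing maximum is taken over these stored values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one candidate recommendation."""
    uri: str
    # Annotated entity: its own occurrence in the reviews of the initial item.
    # Discovered entity: occurrence of the entity through which it was discovered.
    occurrence: int
    # LDSD to the entity through which it was discovered; None if annotated.
    ldsd: Optional[float] = None


def rank_score(c: Candidate, max_occurrence: int,
               alpha_discovered: float = 0.5) -> float:
    """Equation 1: R(i) = beta * alpha * occ_i / max_occ + gamma * (1 - LDSD)."""
    if max_occurrence <= 0:
        raise ValueError("max_occurrence must be positive")
    discovered = c.ldsd is not None
    alpha = alpha_discovered if discovered else 1.0
    beta = 0.5 if discovered else 1.0
    gamma = 0.5 if discovered else 0.0
    first_term = alpha * c.occurrence / max_occurrence
    distance_term = (1.0 - c.ldsd) if discovered else 0.0
    return beta * first_term + gamma * distance_term


def rank(candidates: List[Candidate],
         alpha_discovered: float = 0.5) -> List[Candidate]:
    """Sort candidates by descending score, normalizing by the set maximum."""
    max_occ = max(c.occurrence for c in candidates)
    return sorted(candidates,
                  key=lambda c: rank_score(c, max_occ, alpha_discovered),
                  reverse=True)


# Illustrative values only; they are not taken from the paper's data.
candidates = [
    Candidate("dbr:2001:_A_Space_Odyssey", occurrence=2),
    Candidate("dbr:The_Dark_Knight", occurrence=4, ldsd=0.2),
    Candidate("dbr:Stanley_Kubrick", occurrence=4),
]
ranked = rank(candidates)
```

With these made-up inputs, the annotated entity with the maximum occurrence scores 1, while the discovered entity is scored as the average of its scaled occurrence term and 1 − LDSD, so two entities discovered through the same annotated entity are separated only by their distance to it.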