=Paper=
{{Paper
|id=Vol-1388/demo_paper9
|storemode=property
|title=Generating Personalised and Opinionated Review Summaries
|pdfUrl=https://ceur-ws.org/Vol-1388/demo_paper9.pdf
|volume=Vol-1388
|dblpUrl=https://dblp.org/rec/conf/um/MuhammadLRS15
}}
==Generating Personalised and Opinionated Review Summaries==
<pdf width="1500px">https://ceur-ws.org/Vol-1388/demo_paper9.pdf</pdf>
<pre>
       Generating Personalised and Opinionated
                 Review Summaries

      Khalil Muhammad, Aonghus Lawlor, Rachael Rafter, and Barry Smyth

                          Insight Centre for Data Analytics,
                 University College Dublin, Belfield, Dublin 4, Ireland.
                    {firstname.lastname}@insight-centre.org
                          http://www.insight-centre.org/


        Abstract. This paper describes a novel approach for summarising user-
        generated reviews for the purpose of explaining recommendations. We
        demonstrate our approach using TripAdvisor reviews.


1     Introduction

Product reviews written by real users are now mainstream online. Sites
like Amazon and TripAdvisor have collected thousands of reviews for all manner
of products, and users are increasingly relying on these reviews to make better
choices [1]. However, there are so many reviews, some of which are quite long,
and it is increasingly difficult for users to identify the relevant information for
their needs [2]. Recently researchers have begun to explore the potential of such
reviews in building recommender systems [3], identifying useful reviews [4], and
using related techniques to help users write better reviews in the first place [5].
    In this paper we explore how to exploit textual reviews to summarise product
experiences that can explain recommendations. In particular we describe how we
can profile a user based on the reviews that they have written, and identify prod-
uct features that matter to them. And we explain how we can represent products
in a similar fashion, by extracting opinions and sentiment information from their
reviews. We then describe an initial approach to generating personalised product
summaries for given user-product pairs on TripAdvisor data.


2     System Overview

This section describes our approach for generating personalised summaries that
are tailored to a user’s preferences, and non-personalised summaries that reflect
the general opinion of users about a product.


2.1    Feature Extraction and Sentiment Classification

Inspired by the methods described in [6,7], we consider bi-gram features that con-
form to one of two part-of-speech (POS) co-location patterns: a noun preceded
by an adjective (AN) or by a noun (NN). Single-noun features that frequently
co-occur (> 70% of the time) with sentiment words in the same sentence are
also considered [6].
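
The two extraction rules above can be sketched as follows. This is a minimal
illustration, not the authors' implementation: it assumes sentences arrive
already POS-tagged with Penn Treebank style tags (JJ*, NN*), and the function
names, the `lexicon` set and the `threshold` parameter are hypothetical.

```python
from collections import Counter

# Assumed Penn Treebank tag prefixes; the tagger itself is outside this sketch.
ADJ, NOUN = "JJ", "NN"


def bigram_features(tagged):
    """Return bi-grams matching the AN or NN pattern in one tagged sentence."""
    features = []
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        # The second word must be a noun, preceded by an adjective or a noun.
        if t2.startswith(NOUN) and (t1.startswith(ADJ) or t1.startswith(NOUN)):
            features.append(f"{w1.lower()} {w2.lower()}")
    return features


def single_noun_features(sentences, lexicon, threshold=0.7):
    """Keep nouns that share a sentence with a sentiment word
    in more than `threshold` of the sentences they appear in."""
    total, with_sentiment = Counter(), Counter()
    for tagged in sentences:
        words = {w.lower() for w, _ in tagged}
        has_sentiment = bool(words & lexicon)
        for noun in {w.lower() for w, t in tagged if t.startswith(NOUN)}:
            total[noun] += 1
            if has_sentiment:
                with_sentiment[noun] += 1
    return {n for n in total if with_sentiment[n] / total[n] > threshold}
```

Taking tagged tokens as input keeps the sketch independent of any particular
tagger; in practice the tags would come from whichever POS tagger is in use.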
    To evaluate the sentiment of a given feature F_i in sentence S_j of review R_k,
we identify the closest sentiment word w_min to F_i in S_j; F_i is labelled as neutral
if no sentiment words are present. In this context, sentiment words are those
contained in the sentiment lexicon [6]. Next we extract the opinion pattern: the
POS tags for w_min, F_i and any words that occur between them. After a pass
over all features, the frequency of occurrence of each pattern is noted. For valid
patterns (those which occur more than once) we assign sentiment to F_i based
on that of w_min in the sentiment lexicon; sentiment is reversed if S_j contains a
negation term within a 4-word distance of w_min. Features associated with invalid
patterns are labelled as neutral.
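
The labelling procedure above can be sketched in two passes: one to collect
opinion patterns, one to assign labels. This is a simplified sketch under
stated assumptions, not the authors' code: each feature is treated as a single
token at a known index, the lexicon is a hypothetical dict mapping words to
+1 or -1, and the `NEGATIONS` set is an illustrative placeholder.

```python
from collections import Counter

NEGATIONS = {"not", "no", "never", "n't"}  # assumed negation terms


def nearest_sentiment(tagged, feat_idx, lexicon):
    """Index of the sentiment word closest to the feature, or None."""
    candidates = [i for i, (w, _) in enumerate(tagged)
                  if i != feat_idx and w.lower() in lexicon]
    return min(candidates, key=lambda i: abs(i - feat_idx), default=None)


def opinion_pattern(tagged, feat_idx, sent_idx):
    """POS tags of the feature, the sentiment word and the words between."""
    lo, hi = sorted((feat_idx, sent_idx))
    return tuple(tag for _, tag in tagged[lo:hi + 1])


def label_features(occurrences, lexicon, window=4):
    """occurrences: list of (tagged_sentence, feature_index) pairs.
    Returns one label per occurrence: +1, -1, or 0 for neutral."""
    # First pass: find the nearest sentiment word and its opinion pattern.
    found = []
    for tagged, f in occurrences:
        s = nearest_sentiment(tagged, f, lexicon)
        pattern = opinion_pattern(tagged, f, s) if s is not None else None
        found.append((s, pattern))
    counts = Counter(p for _, p in found if p is not None)

    # Second pass: only patterns seen more than once are valid.
    labels = []
    for (tagged, _), (s, pattern) in zip(occurrences, found):
        if s is None or counts[pattern] <= 1:
            labels.append(0)
            continue
        polarity = lexicon[tagged[s][0].lower()]
        lo, hi = max(0, s - window), min(len(tagged), s + window + 1)
        negated = any(w.lower() in NEGATIONS for w, _ in tagged[lo:hi])
        labels.append(-polarity if negated else polarity)
    return labels
```

Two passes are needed because a pattern's validity depends on its frequency
over the whole corpus, which is only known after every feature has been seen.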


2.2   Dataset and Feature Representation

Our corpus is taken from the hotel review site TripAdvisor.com. It contains
226,110 unique reviews written by 150,352 unique users for 2,500 hotels in 6
different cities. We mined over 270,000 unique features, mentioned around 4,129,265
times using the process described in Section 2.1.
    We propose two levels of features to represent users and hotels; these levels
are determined by their similarity to amenities predefined on the TripAdvisor
website. We obtain a set of base features, which are single-nouns and bi-gram
noun phrases from the set of amenities that TripAdvisor uses to describe hotels
(e.g. fitness centre and wheelchair access). We expect these features to be highly
meaningful and familiar to users. We hypothesise that there is an extended bag of
words that people use when talking about the same thing. For instance, a person
talking about ‘breakfast’ may use words like ‘orange juice’ or ‘buffet’. The key
thing to note is that these are related words, but not necessarily synonyms.
We apply k-Means to base features and their corresponding sentences to find
other co-occurring, related features. The k most co-occurring features are used
to enhance the representation of each base feature as expanded features.
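
The final selection step can be illustrated with a simple co-occurrence count.
This sketch deliberately omits the k-Means clustering mentioned above and only
shows how the k most co-occurring features might be chosen for each base
feature; the function name and the input layout (one list of extracted
features per sentence) are assumptions.

```python
from collections import Counter, defaultdict


def expand_features(sentence_features, base_features, k=2):
    """Map each base feature to the k features that most often
    appear in the same sentence as it."""
    cooc = defaultdict(Counter)
    for feats in sentence_features:
        feats = set(feats)
        for base in feats & base_features:
            # Count every other feature seen alongside this base feature.
            cooc[base].update(feats - {base})
    return {base: [f for f, _ in cooc[base].most_common(k)]
            for base in base_features}
```

For the 'breakfast' example in the text, sentences that mention breakfast
together with 'buffet' or 'orange juice' would make those two terms its
expanded features when k = 2.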


2.3   User Preferences and Hotel Profiles

Users normally talk about the things that matter to them in reviews. Therefore
we assume that the preferences of each user consist of the features they mention
in reviews and the relative frequency at which they mention them, which may
indicate their relevance to the user. Hence we define the profile of a user as the
set of all features mentioned by the user in reviews. Each feature in the set is
tagged with the relative frequency with which it was mentioned in the user’s
reviews.
    Similarly we define a hotel profile as a set of features mentioned by users
about the hotel. Each feature in the set is tagged with its relative frequency and
average sentiment score (a value in the range [−1, +1]).
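
A minimal sketch of the two profile structures follows. It assumes that
"relative frequency" means a feature's share of all feature mentions in the
profile, which the text does not state explicitly; the function names and
dictionary keys are hypothetical.

```python
from collections import Counter, defaultdict


def user_profile(mentions):
    """mentions: feature names extracted from one user's reviews.
    Returns feature -> relative frequency."""
    counts = Counter(mentions)
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()}


def hotel_profile(mentions):
    """mentions: (feature, sentiment) pairs from a hotel's reviews,
    with sentiment in {-1, 0, +1}.
    Returns feature -> relative frequency and average sentiment."""
    scores = defaultdict(list)
    for feature, sentiment in mentions:
        scores[feature].append(sentiment)
    total = len(mentions)
    return {f: {"frequency": len(s) / total, "sentiment": sum(s) / len(s)}
            for f, s in scores.items()}
```

Averaging labels in {-1, 0, +1} keeps the hotel's sentiment score inside the
[-1, +1] range given above.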

2.4   Generating Summaries
To construct non-personalised summaries for a user-hotel pair, two hotel profiles
are built using the base and expanded features respectively. Each feature is
assigned a ranking score that is the product of its average sentiment and its
normalised frequency. When the non-neutral features in the hotel profile are
ranked by the ranking score, the top-n and bottom-n features form the pros and
cons parts of the explanation respectively.
    To generate a personalised summary for a user-hotel pair, two profiles are
built for each of the user and the hotel, using the base and expanded features respectively.
Each feature in the hotel profile is assigned a ranking score that is the product
of its average sentiment and its normalised frequency. The ranking score of each
feature in the hotel profile is updated by multiplying its original ranking score
with its normalised frequency in the user profile. When the non-neutral features
in the hotel profile are ranked by the updated ranking score, the top-n and
bottom-n features form the pros and cons parts of the explanation respectively.
In both explanation types, expanded features are mapped to their corresponding
base features that are familiar to the user.
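
The ranking described in this section can be sketched as below. The profile
layout (a dict with hypothetical "frequency" and "sentiment" keys) is an
assumption, as is treating the profile's relative frequency as the normalised
frequency. The sketch also adds one rule that is not in the text: pros must
have a positive score and cons a negative one, so that the two lists cannot
overlap when a hotel has few features.

```python
def ranking_scores(hotel, user=None):
    """Score = average sentiment x hotel frequency,
    multiplied by the user's frequency when a user profile is given."""
    scores = {}
    for feature, stats in hotel.items():
        if stats["sentiment"] == 0:
            continue  # neutral features are not ranked
        score = stats["sentiment"] * stats["frequency"]
        if user is not None:
            score *= user.get(feature, 0.0)
        scores[feature] = score
    return scores


def pros_and_cons(hotel, user=None, n=2):
    """Return the top-n features as pros and the bottom-n as cons."""
    scores = ranking_scores(hotel, user)
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Assumption: keep only positively scored pros and negatively scored cons.
    pros = [f for f in ranked[:n] if scores[f] > 0]
    cons = [f for f in ranked[::-1][:n] if scores[f] < 0]
    return pros, cons
```

Calling `pros_and_cons(hotel)` gives the non-personalised summary, while
`pros_and_cons(hotel, user)` gives the personalised one; features the user
never mentions receive a score of zero and drop out of both lists.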

2.5   Examples
In Section 2.4, we described how we can model a user and hotel profile using
base and expanded features. In Fig. 1 we show a fragment of the user and hotel
profile used to generate the example summary in Fig. 2.


  Fig. 1. Snippet of a hotel and a user profile showing base and expanded features


Fig. 2. An example of a personalised summary of a hotel using profiles with base features.

    Fig. 2 shows a screenshot of a personalised summary generated for a
user-hotel pair. The pro features are highlighted in green, and the con features in red.
We always present base features in the summaries regardless of how we choose
to model the user and hotel profile. This is to avoid having too fine a level of
granularity which might be unintuitive to users. Therefore ‘shuttle bus service’
is a pro feature of the hotel, ranked by the user’s preferences. The tooltips (see
(c) in Fig. 2) display expanded features in snippets of sentences from reviews
that are associated with the base feature in the summary. Here the reviewers
have discussed the ‘printer’ not working, and the closing hours of the ‘centre’;
these have been summarised to ‘business centre’.

3    Conclusion
This paper presents a method for constructing personalised summaries of items
based on opinions from textual reviews. With TripAdvisor data we show how
the pros and cons of hotels can be explained to users using different feature
representations. Our technique focuses on those features that users write about
most frequently in their reviews. This forms the basis for prioritising features that
are likely to be of interest to the user compared to non-personalised explanations,
which focus on features that are commonplace for a hotel but may not be so
relevant for an individual user. This work builds on related work in the area of
opinion mining and recommender systems but considers a novel application in
the form of explanation generation.

References
1. Lee, J., Park, D.H., Han, I.: The different effects of online consumer reviews on
   consumers’ purchase intentions depending on trust in online shopping malls: An
   advertising perspective. Internet Research 21 (2011) 187–206
2. O’Sullivan, D., Smyth, B., Wilson, D.C., McDonald, K., Smeaton, A.: Improving
   the quality of the personalized electronic program guide. User Modeling and User-
   Adapted Interaction 14 (2004) 5–36
3. Dong, R., Schaal, M., O’Mahony, M.P., McCarthy, K., Smyth, B.: Opinionated
   product recommendation. In: Case-Based Reasoning Research and Development.
   Volume 7969 of Lecture Notes in Computer Science. Springer Berlin Heidelberg
   (2013) 44–58
4. O’Mahony, M.P., Smyth, B.: Learning to recommend helpful hotel reviews. In:
   Proceedings of the 3rd ACM Conference on Recommender Systems, ACM (2009)
   305–308
5. Dong, R., McCarthy, K., O’Mahony, M., Schaal, M., Smyth, B.: Towards an intelli-
   gent reviewer’s assistant: Recommending topics to help users to write better product
   reviews. In: Proceedings of the 2012 ACM International Conference on Intelligent
   User Interfaces, ACM (2012) 159–168
6. Hu, M., Liu, B.: Mining and summarizing customer reviews. In: Proceedings of the
   Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data
   Mining. KDD ’04, New York, NY, USA, ACM (2004) 168–177
7. Moghaddam, S., Ester, M.: Opinion digger: An unsupervised opinion miner from
   unstructured product reviews. In: Proceedings of the 19th ACM International Con-
   ference on Information and Knowledge Management. CIKM ’10, New York, NY,
   USA, ACM (2010) 1825–1828

</pre>