=Paper= {{Paper |id=Vol-1245/paper10 |storemode=property |title=Preference Mapping for Automated Recommendation of Product Attributes for Designing Marketing Content |pdfUrl=https://ceur-ws.org/Vol-1245/cbrecsys2014-paper10.pdf |volume=Vol-1245 |dblpUrl=https://dblp.org/rec/conf/recsys/SinhaR14 }} ==Preference Mapping for Automated Recommendation of Product Attributes for Designing Marketing Content== https://ceur-ws.org/Vol-1245/cbrecsys2014-paper10.pdf
     Preference Mapping for Automated Recommendation of
       Product Attributes for Designing Marketing Content

                                                         Moumita Sinha and Rishiraj Saha Roy
                                                                       Adobe Research Labs, India
                                                                        Bangalore, India - 560029.
                                                                 {mousinha, rroy}@adobe.com


ABSTRACT
Identification of relevant product attributes is critical to the success of any marketing campaign. This task can be conceptualized as an attribute recommendation problem based on the product's content or features, where the goal of a solution would be to automatically recommend relevant features to the marketer for highlighting in a campaign. In this research, we try to solve this problem by using preference mapping, a powerful technique for associating feature preferences with users. We perform preference mapping with sentiment scores associated with product attributes mined from user reviews on the Web. As a result of this process, we are able to visualize a set of compared products and the appropriateness of the attributes on the same two-dimensional space, enabling us to easily recommend important features to a marketer. Finally, we show that expert recommendations or ratings for product features do not necessarily correlate with preference maps based on user sentiments.

Categories and Subject Descriptors
Information retrieval [Retrieval tasks and goals]: Recommender systems

General Terms
Algorithms, Experimentation, Human factors

Keywords
Preference Mapping, Sentiment Scores, Product Attributes

Copyright 2014 for the individual papers by the paper's authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA.

1. INTRODUCTION
   Motivation. Product manufacturers are always faced with the dilemma of identifying which attribute(s) of their products they should highlight in their targeted marketing campaigns. For example, a digital camera has several defining aspects like power of zoom, size of display and image size in megapixels. A release of a new camera model by a manufacturer like Nikon will usually be followed by a marketing campaign to potential customers that will try to highlight certain aspects or attributes of the model. This attribute recommendation problem is critical to the success of the campaign. Focusing on features that do not appeal to users can result in the loss of a large amount of ad spend and potential losses in product revenue for a manufacturer. In this paper, we address this challenge by proposing a principled technique called preference mapping [6], used in a novel way to automate the process of product attribute recommendation.
   Related research. Alpert [1] presents one of the relatively early works emphasizing the importance of identifying relevant product attributes, and compares the effectiveness of direct and indirect questioning techniques. Cropper et al. [3] find that a linear hedonic price function performs as well as a linear logit model in estimating consumer preferences for product attributes. But their analysis is based on simulations and does not draw connections between preferred attributes and campaign design. Zhang and Liu [12] try to identify product features that are associated with user sentiment by analyzing the contextual text associated with the mention of the product feature. While it could be meaningful to further scrutinize such attributes while designing product campaigns, the authors do not propose any method towards that end. Lehdonvirta [10] aims to discover product attributes that are likely to drive purchase decisions for virtual goods like online games and engaging activities on social media. However, the analysis presented by the author is purely from a sociological perspective and the author does not provide an algorithm for automating the above process. Recommendation algorithms similar to collaborative filtering have been used for designing campaigns, but they rely heavily on large amounts of existing customer preference data available with the advertiser [11]. On a related note, they are also known to have limitations such as data sparsity and model scalability, which lead to poor recommendations [2]. We provide a method for associating products with their marketable attributes that relate to each other based on publicly available sources. Such data sources may become accessible much before the advertiser receives direct information about customers' preferences based on product view or product purchase data. Preference mapping is an approach to identify customer preferences based on users' surveys of product attributes. Individual user differences are not averaged, but are directly incorporated into the mapping model and play vital roles in the preference fitting process [5]. To date, the technique has only been used for understanding user preferences for diverse food items like lamb sausages [7], lager beer [6] and vanilla ice cream [4]. We believe that this




method has a far greater potential and can be readily extended to unexplored application areas.
   Approach. In this research, after specifying our product and attribute set, we acquire sentiment scores of user reviews that mention attributes for the products in our set. Following this, we associate user sentiments with the attributes mentioned in the reviews (instead of the product as a whole) and average them over reviewers who have written reviews concerning the attributes. We perform preference mapping on this processed dataset involving products, attributes and average sentiment scores and generate a biplot visualization that can be used for attribute recommendation. Finally, we compare our recommendations with expert opinion and show that there is no perfect correlation between what experts believe to be good features and what consumers like in a marketed product.
   Organization. The rest of this paper is organized as follows. In Sec. 2, we describe our method of applying preference mapping to this situation. Next, we describe our data in Sec. 3 followed by experimental results and discussion in Sec. 4. Finally, we summarize our contribution and provide directions for future work in Sec. 5.

2. METHOD
   We analyze a set of products p and a set of product attributes k. Customers who have bought these products often go to the product or retailer website to provide feedback about the product in the form of textual reviews. Most of these reviews generally contain mentions of product attributes. Further, positive or negative sentiments usually accompany the above mentions of the attributes. In our approach, we collect reviews where each sentence talks about only one attribute. Appropriate anaphora resolution is performed for review sentences when the attribute name is not directly mentioned [8]. Each sentence in each review is then assigned a sentiment score. Since each sentence mentions exactly one attribute, the sentiment score associated with the sentence is assumed to be the score associated with the attribute. Note that the effectiveness of our algorithm is not affected by the scale or range of this sentiment scoring. Next, the scores are averaged over the reviewers for each attribute for each product.
   A preference mapping is then performed with the reviewer-averaged scores of each of the various attributes for the different products. We now explain how this is performed. As the first step, sentiment scores for all the product attributes are scaled to the same range so that variances are comparable across attributes of each product. Consider $X = (X_1, X_2, \ldots, X_p)^T$ as the matrix of the reviewer-averaged scores for the p products (say, different camera models) and the k attributes (like battery life, size of display and shutter delay). Thus each $X_i$ is a vector with its elements as $X_{ij}$, which is the reviewer-averaged sentiment score for attribute j of product i. The principal component (PC) transformation of the feature vector $X$ is the linear transformation

$$Y = \Gamma^T (X - \mu), \quad \text{where } \mu = E(X) \text{ and } \Sigma = \mathrm{Var}(X) = \Gamma \Lambda \Gamma^T.$$

The transformation is such that $\mathrm{Var}(Y)$ is maximized and the following holds:

$$\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_p$$

where $\mathrm{Var}(Y_j) = \lambda_j$, $j = 1, 2, \ldots, p$, $E(Y_j) = 0$ and $\mathrm{Cov}(Y_j, Y_i) = 0$ when $i \neq j$.
   Functions $\mathrm{Var}(\cdot)$, $E(\cdot)$ and $\mathrm{Cov}(\cdot)$ refer to the variance, expectation, and covariance functions, respectively, and the $\lambda_j$'s (the diagonal entries of $\Lambda$) represent the eigenvalues of the matrix $X$. These eigenvalues have the corresponding eigenvectors $\gamma_1, \gamma_2, \ldots, \gamma_p$, the columns of $\Gamma$ (the number of eigenvectors is equal to the rank of the matrix $X$). Then the ith PC for each product is the weighted sum of the scores of the product across the attributes, the weights being obtained from the ith eigenvector. A biplot graph can be plotted for PC1 and PC2 with the weighted scores of each of the products and the eigenvector values for each attribute. The resultant graph provides an easily interpretable visualization that shows how products compare with each other based on customer reviews and the relative proximity of each attribute to their respective products with respect to associated positive user sentiment. Based on this multivariate visualization, marketing contents can be designed, highlighting favorable attributes for products. A schematic of the steps a marketer will undergo to utilize statistical analysis of social reviews to design product specific marketing campaigns is shown in Figure 1. Relevant steps have been explained in this section. Specific details about our dataset and experimental setup will be provided in the next section.

Figure 1: A schematic of the steps in our use case: The steps in green are part of the workflow, while those in blue are part of the proposed algorithm.

3. DATASET
   We test our approach on a dataset consisting of 1309 reviews related to four digital camera models (Canon G3, Canon PowerShot SD500, Canon S100, and Nikon Coolpix 4300), having a total of 13 distinct attributes. The attributes (or features) that we analyzed are: flash, zoom, battery, auto (quality of automatic mode), photo quality, view (quality of view through the viewfinder), delay (delay between photos), look, start (startup speed), color, night (quality of night photos), lens and resolution. The reviews are pre-processed to identify mentions of camera attributes within their texts. The 13 attributes are mentioned a total of 583 times in the product reviews that we collected.
   Expert ratings. It is an interesting exercise to compare our attribute recommendation system to expert opinion. To this end, we went through popular digital camera review sites dcresource¹ and imaging-resource² for extracting expert ratings on the thirteen attributes for our

¹ http://www.dcresource.com, Accessed 11 July '14.
² http://www.imaging-resource.com, Accessed 11 July '14.
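The reviewer-averaged score matrix of Sec. 2 can be illustrated with a minimal sketch. It assumes one record per attribute mention and takes a plain mean per product-attribute pair, with neutral mentions scored 0 and unmentioned pairs set to 0; the function name and the toy scores are hypothetical, and the paper's own averaging of positive and negative scores may differ in detail.

```python
from collections import defaultdict


def reviewer_averaged_matrix(records, products, attributes):
    """Build a products x attributes matrix of mean sentiment scores.

    records: iterable of (product, attribute, score) tuples, one per
    attribute mention; a neutral mention carries a score of 0.0.
    Pairs that are never mentioned are treated as neutral (0.0).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for product, attribute, score in records:
        totals[(product, attribute)] += score
        counts[(product, attribute)] += 1

    matrix = []
    for product in products:
        row = []
        for attribute in attributes:
            n = counts.get((product, attribute), 0)
            row.append(totals[(product, attribute)] / n if n else 0.0)
        matrix.append(row)
    return matrix


# Toy values for illustration only; they are not taken from the dataset.
products = ["Canon G3", "Nikon Coolpix 4300"]
attributes = ["lens", "zoom"]
records = [
    ("Canon G3", "lens", 1.0),
    ("Canon G3", "lens", 0.5),
    ("Canon G3", "zoom", -0.5),
    ("Nikon Coolpix 4300", "zoom", 0.75),
    ("Nikon Coolpix 4300", "zoom", 0.0),  # neutral mention
]
X = reviewer_averaged_matrix(records, products, attributes)
```

Each row of `X` corresponds to one product and each column to one attribute, which is the layout the PC transformation of Sec. 2 expects.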




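The PC transformation and biplot coordinates of Sec. 2 can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: "scaled to the same range" is taken to mean z-scoring each attribute, the toy matrix is invented, and `recommend` encodes one possible reading of "closer and in the same direction as a product" (the inner product of product and attribute coordinates in the PC1-PC2 plane).

```python
import numpy as np


def preference_map(X):
    """PCA of a products x attributes score matrix.

    Returns (scores, loadings, eigvals): product coordinates on the PCs,
    attribute eigenvectors (one column per PC), and the eigenvalues in
    descending order.
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0                    # leave constant attributes unscaled
    Z = (X - mu) / sd                    # centre and scale each attribute
    sigma = np.cov(Z, rowvar=False)      # attribute covariance matrix
    eigvals, eigvecs = np.linalg.eigh(sigma)
    order = np.argsort(eigvals)[::-1]    # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                 # row i holds the PCs of product i
    return scores, eigvecs, eigvals


def recommend(scores, loadings, attributes, product_index, top=2):
    """Rank attributes by alignment with a product in the PC1-PC2 plane."""
    alignment = loadings[:, :2] @ scores[product_index, :2]
    ranked = np.argsort(alignment)[::-1][:top]
    return [attributes[j] for j in ranked]


# Toy matrix (4 hypothetical products x 3 attributes), for illustration only.
attributes = ["lens", "zoom", "battery"]
X = [[0.9, 0.1, 0.4],
     [0.8, 0.2, 0.1],
     [0.1, 0.9, 0.5],
     [0.2, 0.7, 0.6]]
scores, loadings, eigvals = preference_map(X)
explained = eigvals[:2].sum() / eigvals.sum()   # share of variance in PC1 and PC2
top_for_first = recommend(scores, loadings, attributes, 0, top=1)
```

Plotting the first two columns of `scores` as points and the first two columns of `loadings` as arrows gives a biplot of the kind described above. The ranking in `recommend` does not depend on the arbitrary sign of each eigenvector, because flipping a PC flips the product and attribute coordinates together.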
four camera models. Since none of the popular camera review sites provide direct numeric ratings for attributes, we mapped expert opinion to a score of 1 or 2 depending upon the comments provided. For example, comments containing words like exceptional, excellent and good about an attribute were mapped to two, and weak and worst were assumed to be a one rating. The data that we collected has been made publicly available at http://goo.gl/v8BGj4.

4. EXPERIMENTS AND RESULTS
   We assign a sentiment score to each sentence in each review in our dataset with the Alchemy API³ and transfer the score to the attribute mentioned in the sentence. The higher the magnitude of the score, the stronger the associated sentiment. Following this, the positive and negative sentiment scores of all the 52 (= 13 × 4) camera-attribute pairs were averaged together over all the reviewers who mentioned the pair in their reviews, the neutral sentiments contributing zero to the sum. The missing observations are assumed to be neutral sentiments and hence the scores in such cases are assumed to be zero. These average sentiments for each camera over all attributes are shown in a radial chart in Figure 2. As a specific example, the battery of the Canon S100 was mentioned in 13 reviews, with seven, one, and five review(s) showing positive, negative and neutral scores respectively. While the numbers of positive and negative mentions seem comparable, the average positive and negative sentiment scores were found to be 1.3461 and 0.3569 respectively, indicating that the strength of the negative sentiment was not as strong as the positive sentiment. In our experiments, the two values were averaged to obtain 0.8515.

Figure 2: (Color online) Reviewer-averaged sentiment scores of attributes for our camera models. [Radial charts titled "Average Sentiment Scores", one per camera model, with one spoke per attribute.]

   We now have a matrix with four rows (corresponding to each camera model) and thirteen columns (corresponding to each model attribute). The cells of this matrix are the reviewer-averaged sentiment scores associated with each camera and attribute pair. A principal component analysis (PCA) is then performed on this matrix of camera-attribute pairs. PC1 and PC2 for this example cumulatively explain 85% of the variability in the data. We then produce the biplot of the weighted scores of the products and the eigenvectors of each of the attributes, as shown in Figure 3.

Figure 3: (Color online) A biplot of the weighted scores of products and eigenvector attributes. Attributes are in red and product names are in gray. [Horizontal axis: PC1; vertical axis: PC2.]

   This graph provides a lot of information for the design of marketing campaigns. First, in the graph, two attributes (in red) that point in the same direction tend to be highly positively correlated. A product that is in the same direction as an attribute has a high value for this attribute. Thus, from the graph, we can conclude that attributes which are closer to and in the same direction as a product are the ones that should be recommended for highlighting in marketing content for that particular model. For example, Canon G3 and Canon S100 received high sentiment scores on attributes like lens and color, while Nikon Coolpix 4300 and Canon PowerShot SD500 received high positive sentiments on low shutter delay and zoom quality. Thus, for example, lens and color should be recommended for designing marketing content in the campaign for Canon G3, rather than the zoom.
   Second, this methodology also helps to contrast competing products simultaneously and provides competitive intelligence to the marketer. Thus, based on the given set of consumers' reviews, one can deduce that Nikon Coolpix 4300 and Canon PowerShot SD500 are similar with respect to the attributes studied, as compared to Canon G3 and Canon S100. For example, if Nikon Coolpix 4300 and Canon PowerShot SD500 are competing products, then it is meaningful to recommend only discriminatory features that add value to a particular product for its campaign. It is more sensible to recommend flash for Nikon Coolpix 4300 (closer to that model than to Canon PowerShot SD500) than the zoom, which is approximately equidistant from both products.
   Analysis of expert opinion. From the data collected on expert comments (Sec. 3), we find that many of the discussed attributes are rated as 2, which implies that these attributes are "excellent" or "good" (Table 1). We assume that a high expert score is analogous to high positive sentiment.
   Table 2 shows the Kendall-Tau rank correlation coefficients between the preference mapping technique and the plain average sentiment scores (which is the unweighted sum of the attributes as opposed to the weighted sum for each camera). For three cameras we have statistically significant
3
    http://www.alchemyapi.com                                                                           (at 0.05 level) correlation between the methods and a moder-




                                                                        the potential customer and is likely to improve customer
Table 1: Proportion of Attributes Rated as Excel-                       satisfaction.
lent/Good and Poor.                                                        As future work, we would like to cluster products using at-
  Camera                           Excellent/Good          Poor         tribute sentiment scores as features and observe the correla-
  Canon G3                         0.385                   0.538        tion of the clustering output to the representation produced
  Canon S100                       0.615                   0.231        by our preference mapping technique. Also, the quality of
  Canon Powershot SD500            0.385                   0.538        the reviews can be improved by choosing relevant users by
  Nikon Coolpix 4300               0.615                   0.385        mapping them to specific customer segments. This can lead
                                                                        to better insights on the data and finer levels of control in
Expert ratings were not available for all the attributes. So the sum
of the values in a row may not add up to one                            the design of marketing content.

                                                                        Acknowledgements
Table 2: Correlation between ranks of the attributes
                                                                        We thank Ritwik Sinha from Adobe Research Labs India for
based on average sentiment scores and preference
                                                                        valuable inputs at various stages of this work.
mapping scores.
  Camera                           Kendall-Tau         p-Value          6.   REFERENCES
  Canon G3                         0.564               0.007             [1] M. I. Alpert. Identification of determinant attributes:
  Canon S100                       0.615               0.003                 A comparison of methods. Journal of Marketing
  Canon Powershot SD500            0.641               0.002                 Research, pages 184–191, 1971.
  Nikon Coolpix 4300               0.294               0.172             [2] Y. H. Cho, J. K. Kim, and S. H. Kim. A personalized
                                                                             recommender system based on web usage mining and
                                                                             decision tree induction. Expert Systems with
ate correlation for Nikon Coolpix 4300. This shows that our                  Applications, 23(3):329–342, 2002.
method has high correlation with the intuitive understand-               [3] M. L. Cropper, L. Deck, N. Kishor, and K. E.
ing of the importance of the attributes and helps in further                 McConnell. Valuing product attributes using single
refinement. We could not observe any direct relation be-                     market data: a comparison of hedonic and discrete
tween the predictions based on the preference mapping and                    choice approaches. The Review of economics and
the attributes highly rated by experts.                                      Statistics, pages 225–232, 1993.
                                                                         [4] L. Dooley, Y. S. Lee, and J. F. Meullenet. The
                                                                             application of check-all-that-apply (CATA) consumer
5. CONCLUSIONS AND FUTURE WORK                                               profiling to preference mapping of vanilla ice cream
   The preference mapping technique, as described by us in                   and its comparison to classical external preference
this research, recommends potentially “valuable” attributes                  mapping. Food quality and preference, 21(4):394–401,
of products to marketers for highlighting in a marketing                     2010.
campaign. Our method provides the marketer the ability                   [5] K. Greenhoff and H. MacFie. Preference mapping in
to design marketing content that can potentially increase                    practice. In H. MacFie and D. Thomson, editors,
response rates. We have used sentiment scores for product                    Measurement of Food Preferences, pages 137–166.
attributes, extracted from review texts to identify product                  Springer US, 1994.
features to be highlighted in campaigns. By focusing on at-              [6] J. X. Guinard, B. Uotani, and P. Schlich. Internal and
tributes that are known to have received positive sentiments                 external mapping of preferences for commercial lager
of customers, the risk in the campaign is minimized. More-                   beers: comparison of hedonic ratings by consumers
over, the comparison with the experts’ comments suggests                     blind versus with knowledge of brand and price. Food
that sometimes, what customers value more about a                            Quality and Preference, 12(4):243–255, 2001.
product may be different from attributes that experts consider           [7] H. Helgesen, R. Solheim, and T. Næs. Consumer
of high quality. So, designing marketing content taking into
                                                                             preference mapping of dry fermented lamb sausages.
account what a large section of consumers show positive                      Food Quality and Preference, 8(2):97–109, 1997.
sentiments towards may help in engaging more effectively with
                                                                         [8] S. Lappin and H. J. Leass. An algorithm for
a larger section of the consumers. The sentiment score in
                                                                             pronominal anaphora resolution. Comput. Linguist.,
our research is a continuous variable and PCA has been used
                                                                             20(4):535–561, Dec. 1994.
to identify appropriate attributes that have high scores. If
                                                                         [9] S. Lê, J. Josse, F. Husson, et al. FactoMineR: an R
some or all the scores are categorical in nature, multi-factor
                                                                             package for multivariate analysis. Journal of Statistical
analysis [9] is preferable to PCA. The proposed technology
                                                                             Software, 25(1):1–18, 2008.
does not require large amounts of customer preference
data to be available internally with the advertiser (for ex-            [10] V. Lehdonvirta. Virtual item sales as a revenue model:
ample, customers who have viewed the same product or cus-                    identifying attributes that drive purchase decisions.
tomers who have bought the same product), from their own                     Electronic Commerce Research, pages 97–113, 2009.
sales and browsing patterns. Rather, we use reviews that                [11] G. Linden, B. Smith, and J. York. Amazon.com
directly reflect customer preferences. The reviews can be                    recommendations: Item-to-item collaborative filtering.
collected from any external source with consumers’ opinion.                  Internet Computing, IEEE, 7(1):76–80, 2003.
The other major strength of our approach is that it is more             [12] L. Zhang and B. Liu. Identifying noun product
likely to be positively viewed by the future customer. Such                  features that imply opinions. In HLT ’11, pages
an approach enables having an informed conversation with                     575–580, 2011.



