=Paper=
{{Paper
|id=Vol-1245/paper10
|storemode=property
|title=Preference Mapping for Automated Recommendation of Product Attributes for Designing Marketing Content
|pdfUrl=https://ceur-ws.org/Vol-1245/cbrecsys2014-paper10.pdf
|volume=Vol-1245
|dblpUrl=https://dblp.org/rec/conf/recsys/SinhaR14
}}
==Preference Mapping for Automated Recommendation of Product Attributes for Designing Marketing Content==
Moumita Sinha and Rishiraj Saha Roy
Adobe Research Labs, India
Bangalore, India - 560029.
{mousinha, rroy}@adobe.com
ABSTRACT
Identification of relevant product attributes is critical to the success of any marketing campaign. This task can be conceptualized as an attribute recommendation problem based on the product's content or features, where the goal of a solution would be to automatically recommend relevant features to the marketer for highlighting in a campaign. In this research, we try to solve this problem by using preference mapping, a powerful technique for associating feature preferences with users. We perform preference mapping with sentiment scores associated with product attributes mined from user reviews on the Web. As a result of this process, we are able to visualize a set of compared products and the appropriateness of the attributes on the same two-dimensional space, enabling us to easily recommend important features to a marketer. Finally, we show that expert recommendations or ratings for product features do not necessarily correlate with preference maps based on user sentiments.

Categories and Subject Descriptors
Information retrieval [Retrieval tasks and goals]: Recommender systems

General Terms
Algorithms, Experimentation, Human factors

Keywords
Preference Mapping, Sentiment Scores, Product Attributes

Copyright 2014 for the individual papers by the paper's authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.
CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA.

1. INTRODUCTION
Motivation. Product manufacturers are always faced with the dilemma of identifying which attribute(s) of their products they should highlight in their targeted marketing campaigns. For example, a digital camera has several defining aspects like power of zoom, size of display and image size in megapixels. A release of a new camera model by a manufacturer like Nikon will usually be followed by a marketing campaign to potential customers that will try to highlight certain aspects or attributes of the model. This attribute recommendation problem is critical to the success of the campaign. Focusing on features that do not appeal to users can result in the loss of a large amount of ad spend and potential losses in product revenue for a manufacturer. In this paper, we address this challenge by proposing a principled technique called preference mapping [6], used in a novel way to automate the process of product attribute recommendation.

Related research. Alpert [1] presents one of the relatively early works emphasizing the importance of identifying relevant product attributes, and compares the effectiveness of direct and indirect questioning techniques. Cropper et al. [3] find that a linear hedonic price function performs as well as a linear logit model in estimating consumer preferences for product attributes. But their analysis is based on simulations and does not draw connections between preferred attributes and campaign design. Zhang and Liu [12] try to identify product features that are associated with user sentiment by analyzing the contextual text associated with the mention of the product feature. While it could be meaningful to further scrutinize such attributes while designing product campaigns, the authors do not propose any method towards that end. Lehdonvirta [10] aims to discover product attributes that are likely to drive purchase decisions for virtual goods like online games and engaging activities on social media. However, the analysis presented by the author is purely from a sociological perspective and the author does not provide an algorithm for automating the above process.

Recommendation algorithms similar to collaborative filtering have been used for designing campaigns, but they rely heavily on large amounts of existing customer preference data available with the advertiser [11]. On a related note, they are also known to have limitations such as data sparsity and model scalability, which lead to poor recommendations [2]. We provide a method for associating products with their marketable attributes that relate to each other based on publicly available sources. Such data sources may become accessible much before the advertiser receives direct information about customers' preferences based on product view or product purchase data. Preference mapping is an approach to identify customer preferences based on users' surveys of product attributes. Individual user differences are not averaged, but are directly incorporated into the mapping model and play vital roles in the preference fitting process [5]. To date, the technique has only been used for understanding user preferences for diverse food items like lamb sausages [7], lager beer [6] and vanilla ice cream [4]. We believe that this
method has a far greater potential and can be readily extended to unexplored application areas.

Approach. In this research, after specifying our product and attribute set, we acquire sentiment scores of user reviews that mention attributes for the products in our set. Following this, we associate user sentiments with the attributes mentioned in the reviews (instead of the product as a whole) and average them over reviewers who have written reviews concerning the attributes. We perform preference mapping on this processed dataset involving products, attributes and average sentiment scores and generate a biplot visualization that can be used for attribute recommendation. Finally, we compare our recommendations with expert opinion and show that there is no perfect correlation between what experts believe to be good features and what consumers like in a marketed product.

Organization. The rest of this paper is organized as follows. In Sec. 2, we describe our method of applying preference mapping to this situation. Next, we describe our data in Sec. 3 followed by experimental results and discussion in Sec. 4. Finally, we summarize our contribution and provide directions for future work in Sec. 5.

2. METHOD
Figure 1: A schematic of the steps in our use case: The steps in green are part of the workflow, while those in blue are part of the proposed algorithm.

We analyze a set of p products and a set of k product attributes. Customers who have bought these products often go to the product or retailer website to provide feedback about the product in the form of textual reviews. Most of these reviews contain mentions of product attributes. Further, positive or negative sentiments usually accompany these mentions of the attributes. In our approach, we collect reviews where each sentence talks about only one attribute. Appropriate anaphora resolution is performed for review sentences when the attribute name is not directly mentioned [8]. Each sentence in each review is then assigned a sentiment score. Since each sentence mentions exactly one attribute, the sentiment score associated with the sentence is assumed to be the score associated with the attribute. Note that the effectiveness of our algorithm is not affected by the scale or range of this sentiment scoring. Next, the scores are averaged over the reviewers for each attribute for each product.

In the principal component (PC) transformation used for the mapping, the functions $\mathrm{Var}(\cdot)$, $E(\cdot)$ and $\mathrm{Cov}(\cdot)$ refer to the variance, expectation, and covariance functions, respectively, and the $\lambda_j$'s represent the eigenvalues of the covariance matrix $\Sigma$ of X. These eigenvalues have the corresponding eigenvectors $\gamma_1, \gamma_2, \ldots, \gamma_p$ (the number of eigenvectors is equal to the rank of the matrix X). The ith PC for each product is then the weighted sum of the scores of the product across the attributes, the weights being obtained from the ith eigenvector. A biplot graph can be plotted for PC1 and PC2 with the weighted scores of each of the products and the eigenvector values for each attribute. The resultant graph provides an easily interpretable visualization that shows how products compare with each other based on customer reviews, and the relative proximity of each attribute to the respective products with respect to associated positive user sentiment. Based on this multivariate visualization, marketing content can be designed, highlighting favorable attributes for products. A schematic of the steps a marketer will follow to utilize statistical analysis of social reviews to design product-specific marketing campaigns is shown in Figure 1. Relevant steps have been explained in this section. Specific details about our dataset and experimental setup will be provided in the next section.
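The averaging step and the PC scores used for the biplot can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the record layout `(reviewer, product, attribute, score)`, the function names, and all toy values are hypothetical. Scores are first averaged within a reviewer and then across reviewers (an assumption about how repeated mentions are handled), product-attribute pairs that nobody mentioned are set to zero (treated as neutral), and the scaling of attributes to a common range is omitted for brevity.

```python
from collections import defaultdict

import numpy as np


def reviewer_averaged_matrix(records, products, attributes):
    """Build the products x attributes matrix of reviewer-averaged scores."""
    by_reviewer = defaultdict(list)
    for reviewer, product, attribute, score in records:
        by_reviewer[(product, attribute, reviewer)].append(score)
    by_pair = defaultdict(list)
    for (product, attribute, _reviewer), scores in by_reviewer.items():
        # Assumption: one averaged score per reviewer and pair.
        by_pair[(product, attribute)].append(sum(scores) / len(scores))
    # Pairs without any mention get 0.0, i.e. they are treated as neutral.
    return np.array([
        [float(np.mean(by_pair[(p, a)])) if (p, a) in by_pair else 0.0
         for a in attributes]
        for p in products
    ])


def principal_components(matrix):
    """Return PC scores, eigenvectors and eigenvalues (largest first)."""
    centered = matrix - matrix.mean(axis=0)        # subtract attribute means
    covariance = np.cov(centered, rowvar=False)    # sample covariance of attributes
    eigvals, eigvecs = np.linalg.eigh(covariance)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)   # drop tiny negative round-off
    eigvecs = eigvecs[:, order]
    scores = centered @ eigvecs                    # weighted sums per product
    return scores, eigvecs, eigvals


# Hypothetical toy data: (reviewer, product, attribute, sentence score).
products = ["Canon G3", "Canon S100", "Nikon Coolpix 4300"]
attributes = ["lens", "zoom", "flash"]
records = [
    ("r1", "Canon G3", "lens", 0.8),
    ("r2", "Canon G3", "lens", 0.4),
    ("r2", "Canon G3", "lens", 0.0),
    ("r1", "Canon G3", "zoom", -0.5),
    ("r3", "Canon S100", "lens", 0.6),
    ("r3", "Canon S100", "flash", 0.2),
    ("r4", "Nikon Coolpix 4300", "zoom", 0.7),
    ("r5", "Nikon Coolpix 4300", "flash", 0.5),
    ("r4", "Nikon Coolpix 4300", "flash", 0.1),
]

matrix = reviewer_averaged_matrix(records, products, attributes)
scores, eigvecs, eigvals = principal_components(matrix)
product_points = scores[:, :2]      # PC1/PC2 coordinates of each product
attribute_arrows = eigvecs[:, :2]   # PC1/PC2 weights of each attribute
explained = eigvals[:2].sum() / eigvals.sum()
```

Plotting `product_points` as labelled points and `attribute_arrows` as arrows from the origin would give a biplot of the kind described in this section; `explained` is the share of variance captured by PC1 and PC2.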
A preference mapping is then performed with the reviewer-averaged scores of each of the various attributes for the different products. We now explain how this is performed. As the first step, sentiment scores for all the product attributes are scaled to the same range so that variances are comparable across attributes of each product. Consider $X = (X_1, X_2, \ldots, X_p)^T$ as the matrix of the reviewer-averaged scores for the p products (say, different camera models) and the k attributes (like battery life, size of display and shutter delay). Thus each $X_i$ is a vector with its elements as $X_{ij}$, which is the reviewer-averaged sentiment score for attribute j of product i. The principal component (PC) transformation of the feature vector X is the linear transformation

$$Y = \Gamma^T (X - \mu), \quad \text{where } \mu = E(X) \text{ and } \Sigma = \mathrm{Var}(X) = \Gamma \Lambda \Gamma^T.$$

The transformation is such that $\mathrm{Var}(Y)$ is maximized and the following holds:

$$\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_p$$

where $\mathrm{Var}(Y_j) = \lambda_j$, $j = 1, 2, \ldots, p$, $E(Y_j) = 0$ and $\mathrm{Cov}(Y_j, Y_i) = 0$ when $i \neq j$.

3. DATASET
We test our approach on a dataset consisting of 1309 reviews related to four digital camera models (Canon G3, Canon Powershot SD500, Canon S100, and Nikon Coolpix 4300), having a total of 13 distinct attributes. These attributes (or features) that we analyzed are: flash, zoom, battery, auto (quality of automatic mode), photo quality, view (quality of view through the viewfinder), delay (delay between photos), look, start (startup speed), color, night (quality of night photos), lens and resolution. The reviews are pre-processed to identify mentions of camera attributes within their texts. The 13 attributes are mentioned a total of 583 times in the product reviews that we collected.

Expert ratings. It is an interesting exercise to compare our attribute recommendation system to expert opinion. To this end, we went through popular digital camera review sites dcresource (http://www.dcresource.com, accessed 11 July '14) and imaging-resource (http://www.imaging-resource.com, accessed 11 July '14) for extracting expert ratings on the thirteen attributes for our
four camera models. Since none of the popular camera review sites provide direct numeric ratings for attributes, we mapped expert opinion to a score of 1 or 2 depending upon the comments provided. For example, comments containing words like exceptional, excellent and good about an attribute were mapped to two, and weak and worst were assumed to be a one rating. The data that we collected has been made publicly available at http://goo.gl/v8BGj4.

4. EXPERIMENTS AND RESULTS
We assign a sentiment score to each sentence in each review in our dataset with the Alchemy API (http://www.alchemyapi.com) and transfer the score to the attribute mentioned in the sentence. The higher the magnitude of the score, the stronger the associated sentiment. Following this, the positive and negative sentiment scores of all the 52 (= 13 × 4) camera-attribute pairs were averaged together over all the reviewers who mentioned the pair in their reviews, the neutral sentiments contributing zero to the sum. The missing observations are assumed to be neutral sentiments and hence the scores in such cases are assumed to be zero. These average sentiments for each camera over all attributes are shown in a radial chart in Figure 2. As a specific example, the battery of the Canon S100 was mentioned in 13 reviews, with seven, one, and five reviews showing positive, negative and neutral scores respectively. While the numbers of positive and negative mentions seem comparable, the average positive and negative sentiment scores were found to be 1.3461 and 0.3569 respectively, indicating that the strength of the negative sentiment was not as strong as the positive sentiment. In our experiments, the two values were averaged to obtain 0.8515.

Figure 2: (Color online) Reviewer-averaged sentiment scores of attributes for our camera models. [Radial chart titled "Average Sentiment Scores", with panels labelled Canon G3, Canon PowerShot SD500, Canon S100 and Nikon Coolpix 4300.]

We now have a matrix with four rows (corresponding to each camera model) and thirteen columns (corresponding to each model attribute). The cells of this matrix are the reviewer-averaged sentiment scores associated with each camera and attribute pair. A principal component analysis (PCA) is then performed on this matrix of camera-attribute pairs. PC1 and PC2 for this example cumulatively explain 85% of the variability in the data. We then produce the biplot of the weighted scores of the products and the eigenvectors of each of the attributes, as shown in Figure 3.

Figure 3: (Color online) A biplot of the weighted scores of products and eigenvector attributes. Attributes are in red and product names are in gray. [Horizontal axis: PC1; vertical axis: PC2.]

This graph provides a lot of information for the design of marketing campaigns. First, in the graph, two attributes (in red) that point in the same direction tend to be highly positively correlated. A product that is in the same direction as an attribute has a high value for this attribute. Thus, from the graph, we can conclude that attributes which are closer to and in the same direction as a product are the ones that should be recommended for highlighting in marketing content for that particular model. For example, Canon G3 and Canon S100 received high sentiment scores on attributes like lens and color, while Nikon Coolpix 4300 and Canon PowerShot SD500 received high positive sentiments on low shutter delay and zoom quality. Thus, for example, lens and color should be recommended for designing marketing content in the campaign for Canon G3, rather than the zoom.

Second, this methodology also helps to contrast competing products simultaneously and provides competitive intelligence to the marketer. Thus, based on the given set of consumers' reviews, one can deduce that Nikon Coolpix 4300 and Canon PowerShot SD500 are similar with respect to the attributes studied, as compared to Canon G3 and Canon S100. For example, if Nikon Coolpix 4300 and Canon Powershot SD500 are competing products, then it is meaningful to recommend only discriminatory features that add value to a particular product for its campaign. It is more sensible to recommend flash for Nikon Coolpix 4300 (it is closer to this model than to the Canon PowerShot SD500) than the zoom, which is approximately equidistant from both products.

Analysis of expert opinion. From the data collected on expert comments (Sec. 3), we find that many of the discussed attributes are rated as 2, which implies that these attributes are "excellent" or "good" (Table 1). We assume that a high expert score is analogous to high positive sentiment.

Table 1: Proportion of Attributes Rated as Excellent/Good and Poor.
Camera                    Excellent/Good    Poor
Canon G3                  0.385             0.538
Canon S100                0.615             0.231
Canon Powershot SD500     0.385             0.538
Nikon Coolpix 4300        0.615             0.385
Expert ratings were not available for all the attributes, so the values in a row may not add up to one.

Table 2 shows the Kendall-Tau rank correlation coefficients between the preference mapping technique and the plain average sentiment scores (which is the unweighted sum of the attributes as opposed to the weighted sum for each camera). For three cameras we have statistically significant (at 0.05 level) correlation between the methods and a moderate correlation for Nikon Coolpix 4300. This shows that our method has high correlation with the intuitive understanding of the importance of the attributes and helps in further refinement. We could not observe any direct relation between the predictions based on the preference mapping and the attributes highly rated by experts.

Table 2: Correlation between ranks of the attributes based on average sentiment scores and preference mapping scores.
Camera                    Kendall-Tau    p-Value
Canon G3                  0.564          0.007
Canon S100                0.615          0.003
Canon Powershot SD500     0.641          0.002
Nikon Coolpix 4300        0.294          0.172

5. CONCLUSIONS AND FUTURE WORK
The preference mapping technique, as described by us in this research, recommends potentially "valuable" attributes of products to marketers for highlighting in a marketing campaign. Our method provides the marketer the ability to design marketing content that can potentially increase response rates. We have used sentiment scores for product attributes, extracted from review texts, to identify product features to be highlighted in campaigns. By focusing on attributes that are known to have received positive sentiments of customers, the risk in the campaign is minimized. Moreover, the comparison with the experts' comments suggests that sometimes, what customers value more about a product may be different from attributes that experts consider of high quality. So, designing marketing content taking into account what a large section of consumers show positive sentiments towards may help in engaging more effectively with a larger section of the consumers. The sentiment score in our research is a continuous variable and PCA has been used to identify appropriate attributes that have high scores. If some or all the scores are categorical in nature, multi-factor analysis [9] is preferable over PCA. The proposed technology does not require large amounts of customer preference data to be available internally with the advertiser (for example, customers who have viewed the same product or customers who have bought the same product), from their own sales and browsing patterns. Rather, we use reviews that directly reflect customer preferences. The reviews can be collected from any external source with consumers' opinion. The other major strength of our approach is that it is more likely to be positively viewed by the future customer. Such an approach enables having an informed conversation with the potential customer and is likely to improve customer satisfaction.

As future work, we would like to cluster products using attribute sentiment scores as features and observe the correlation of the clustering output to the representation produced by our preference mapping technique. Also, the quality of the reviews can be improved by choosing relevant users by mapping them to specific customer segments. This can lead to better insights on the data and finer levels of control in the design of marketing content.

Acknowledgements
We thank Ritwik Sinha from Adobe Research Labs India for valuable inputs at various stages of this work.

6. REFERENCES
[1] M. I. Alpert. Identification of determinant attributes: A comparison of methods. Journal of Marketing Research, pages 184–191, 1971.
[2] Y. H. Cho, J. K. Kim, and S. H. Kim. A personalized recommender system based on web usage mining and decision tree induction. Expert Systems with Applications, 23(3):329–342, 2002.
[3] M. L. Cropper, L. Deck, N. Kishor, and K. E. McConnell. Valuing product attributes using single market data: a comparison of hedonic and discrete choice approaches. The Review of Economics and Statistics, pages 225–232, 1993.
[4] L. Dooley, Y. S. Lee, and J. F. Meullenet. The application of check-all-that-apply (CATA) consumer profiling to preference mapping of vanilla ice cream and its comparison to classical external preference mapping. Food Quality and Preference, 21(4):394–401, 2010.
[5] K. Greenhoff and H. MacFie. Preference mapping in practice. In H. MacFie and D. Thomson, editors, Measurement of Food Preferences, pages 137–166. Springer US, 1994.
[6] J. X. Guinard, B. Uotani, and P. Schlich. Internal and external mapping of preferences for commercial lager beers: comparison of hedonic ratings by consumers blind versus with knowledge of brand and price. Food Quality and Preference, 12(4):243–255, 2001.
[7] H. Helgesen, R. Solheim, and T. Næs. Consumer preference mapping of dry fermented lamb sausages. Food Quality and Preference, 8(2):97–109, 1997.
[8] S. Lappin and H. J. Leass. An algorithm for pronominal anaphora resolution. Comput. Linguist., 20(4):535–561, Dec. 1994.
[9] S. Lê, J. Josse, F. Husson, et al. FactoMineR: an R package for multivariate analysis. Journal of Statistical Software, 25(1):1–18, 2008.
[10] V. Lehdonvirta. Virtual item sales as a revenue model: identifying attributes that drive purchase decisions. Electronic Commerce Research, pages 97–113, 2009.
[11] G. Linden, B. Smith, and J. York. Amazon.com recommendations: Item-to-item collaborative filtering. Internet Computing, IEEE, 7(1):76–80, 2003.
[12] L. Zhang and B. Liu. Identifying noun product features that imply opinions. In HLT '11, pages 575–580, 2011.
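As a companion to the rank comparison reported in Table 2, the following minimal sketch shows how a Kendall tau coefficient between two attribute scorings can be computed. It is an illustration only: the paper does not state which tau variant or significance test was used, so this assumes the simple tau-a form for score lists without ties, and the function name and both score lists are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a for two equally long score lists without ties."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # A pair is concordant when both lists order items i and j the same way.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical scores of five attributes for one camera under the two methods.
average_sentiment = [0.9, 0.7, 0.4, 0.2, 0.1]
preference_mapping = [0.8, 0.3, 0.5, 0.05, 0.1]

tau = kendall_tau(average_sentiment, preference_mapping)
print(f"Kendall tau = {tau:.3f}")
```

A value close to 1 means the two methods rank the attributes almost identically, while a value near 0 means the rankings are largely unrelated.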