Personal Values-based User Modeling from Browsing History of Reviews

Yasufumi Takama, Suzuto Shimizu, Hiroshi Ishikawa
Tokyo Metropolitan University, Tokyo, Japan
ytakama@tmu.ac.jp

©2018. Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. WII’18, March 11, 2018, Tokyo, Japan.

ABSTRACT
This paper proposes a method for modeling users from their browsing history of reviews. A personal values-based recommendation method has been proposed, which models a user’s personal values as the effect of an item’s attributes on his/her decision making. While the existing method obtains a user model from reviews posted by the user, this paper proposes to obtain it from reviews the user consulted for decision making. In order to efficiently identify an attribute that affects the user’s decision making, the proposed method dynamically selects reviews mentioning attributes on which the user might put priority and presents them to the user. A method for selecting items to recommend based on the obtained user models is also proposed. An experimental result with test participants shows the effectiveness of the proposed method.

ACM Classification Keywords
H.5.m. Information Interfaces and Presentation (e.g. HCI): Miscellaneous

Author Keywords
Recommender system; personal values; user modeling; online reviews.

INTRODUCTION
This paper proposes to obtain user models reflecting users’ personal values by analyzing their record of browsing online reviews. The obtained models are used for recommendation.

In recent years, users have posted huge numbers of reviews and ratings online. Such social big data[5] can be utilized for enriching our lives in various ways, including recommendation. In order to promote products, it is necessary to establish a method for predicting users’ preferences and recommending suitable items to them. As ratings are supposed to reflect users’ opinions about items, they can be used to estimate their preferences. Collaborative filtering (CF)[14] and its related algorithms are based on this idea.

CF is one of the most common and successful approaches to recommendation, and its variations and extensions have been studied by many researchers. Variations include item-based[10], matrix factorization-based[8, 9, 16], and graph-based approaches[2]. Extensions include the introduction of additional information for calculating inter-user similarity[1, 13, 11]. This paper focuses on one of those extensions: the introduction of personal values[4]. Personal values and personality are supposed to be important factors in decision making, and they have recently received attention from those studying recommendation[4, 12]. In particular, the Rating Matching Rate (RMRate), which estimates the effect of an item’s attributes on a user’s rating[4], has been proposed for modeling users’ personal values. Its effectiveness for recommendation has been shown in terms of the content-based approach[4], CF[17, 18], and item modeling[19].

Existing studies obtain user models based on RMRate (called PV models hereinafter) from reviews posted by target users, which limits the situations to which they can be applied. First, the approach can be applied only to online review sites with attribute-level evaluations. Second, even when attribute-level evaluations are available, the majority of users on online review sites seldom post reviews, and a PV model cannot be obtained for such users.

This paper focuses on the latter problem. In order to calculate a PV model for users who post no reviews, this paper proposes a method to obtain it from users’ histories of browsing reviews posted by others. A method for recommending items based on the obtained PV models is also proposed, and the effectiveness of both methods is shown by experiments with test participants.

RELATED WORKS
This section briefly introduces studies utilizing personality and personal values for recommender systems. Personal values and personality determine the characteristics of a user’s decision making, and they have been used in marketing. Jayawardhena modeled a hierarchical relationship among personal values, attitudes, and behaviors in e-shopping[6]. Wu et al. proposed a method for recommending diversified items in terms of the most important attributes[20]. In their study, the degree of diversity is determined from the relationship between the user’s personality and his/her needs for diversity.

These studies have shown that personal values are one of the main factors affecting consumption habits. However, they model users’ personal values and personality with abstract factors such as the Rokeach Value Survey[15] and Big Five[7], which have no intuitive relationship with the items to be recommended[12].

As a more direct approach, Hattori et al.[4] have proposed personal values-based user modeling using the Rating Matching Rate (RMRate). A user’s personal values are modeled as the effect that each attribute of an item has on his/her decisions. Given data including users’ item-level evaluations (i.e. ratings) and attribute-level evaluations, the RMRate of user $u_i$ relative to an attribute $a_k$ is calculated as

$$\mathrm{RMR}_{ik} = \frac{\sum_{x_j \in I_i} \delta(p_{ij}, p_{ij}^{k})}{|I_i|}, \tag{1}$$

where $I_i$ is the set of items rated by $u_i$, $p_{ij}$ is the polarity (positive or negative) of the item-level evaluation of $u_i$ on item $x_j$, and $p_{ij}^{k}$ is the polarity of the attribute-level evaluation of $u_i$ on $a_k$ of $x_j$. The function $\delta(x, y)$ returns 1 if $x$ is equal to $y$, and 0 otherwise.

The personal values-based CF[18] calculates inter-user similarity on the basis of PV models. Given the set of attributes of an item ($A$), the PV model of $u_i$ is represented as an $|A|$-dimensional vector consisting of $\mathrm{RMR}_{ik}$ ($a_k \in A$). The Pearson correlation between PV models is calculated among users and used to find neighborhood users.

One of the advantages of the personal values-based CF is that the matrix used for calculating inter-user similarity tends to be dense compared with the user-item matrix, because the number of attributes of an item is usually much smaller than the number of items. Therefore, the number of users whose similarity to a target user can be calculated is expected to be large.

PV MODELING FROM BROWSING HISTORY
Outline of proposed approach
In order to obtain a PV model, not only the item-level evaluations of a target user but also attribute-level evaluations are necessary. Instead of analyzing reviews posted by target users, as done by existing studies, this paper tries to estimate users’ personal values from their history of browsing reviews. Note that this section uses the term ‘user’ for a person for whom a user model is obtained; ‘reviewer’ is used for a person who posted reviews. Let us consider the case in which a user is going to decide whether or not to buy a certain camera by referring to the following 3 reviews.

1. The image quality of this camera is good.
2. It is easy to operate this camera with a single touch of buttons.
3. This camera is lightweight and suitable for bringing it anywhere.

If this user decides to buy this camera following the first review, s/he is supposed to put priority on image quality when s/he evaluates cameras. Therefore, RMRate can be calculated by identifying attributes mentioned as positive / negative in reviews.

In practice, accurately extracting mentioned attributes with their sentiment from reviews is difficult even with state-of-the-art text mining techniques[21]. Instead of applying text mining techniques, this paper utilizes attribute-level evaluations attached to reviews. That is, this paper supposes online review sites which have attribute-level evaluations. As a review explains its reviewer’s opinion about a target item, it is assumed that a reviewer makes a positive comment on an attribute if s/he positively evaluates it.

This paper considers that reviews to be presented to users for obtaining their feedback should satisfy the following conditions.

1. The polarity of an opinion about an attribute mentioned in a review is the same as the polarity of the attribute-level evaluation explicitly given by the reviewer.
2. A review mentions some attributes as evidence of its evaluation.
3. The polarities of the evaluations of all attributes are not the same.

The first condition is required to guarantee the above-mentioned assumption. The proposed method supposes that users make a decision by reading reviews. Therefore, if the second condition is not satisfied, a user reading a review cannot understand why the reviewer evaluated the attributes as s/he did. The third condition is needed to identify the attributes on which a user focuses.

As it is difficult to automatically collect reviews satisfying these conditions with high accuracy, we manually examined collected reviews and constructed a database.

Modeling with dynamic review presentation
The proposed modeling process is shown in Fig. 1. From the constructed database, a set of reviews is selected and presented to the user to obtain feedback. In this paper, 3 reviews are presented at the same time. The feedback consists of the user’s rating of the item (5-point scale, binary, etc.) and the one review that s/he thinks is the most helpful for determining the rating. Based on this feedback, the RMRate of each attribute is updated. That is, the polarity of the user’s rating corresponds to $p_{ij}$ in Eq. (1), and that of the attribute-level evaluation attached to the selected review corresponds to $p_{ij}^{k}$.

An important point in this algorithm is how to determine which reviews are presented to users. It is inconvenient for users if they have to interact with a recommender system many times before receiving recommended items. Therefore, this paper aims to identify, as soon as possible, at least one attribute on which a user would put priority in his/her decision making. Even if a complete PV model has not been obtained, a recommender system could start recommendation based on a single attribute on which the user puts priority.

For the first loop, reviews are randomly selected from the database so that every attribute is mentioned in at least one of those reviews. In the subsequent loops, reviews are selected so as to satisfy the following conditions. Here, the target attribute means the attribute whose RMRate at that time is the highest among all attributes.

1. Present at least one review that positively evaluates the target attribute.
2. Present at least one review that negatively evaluates the target attribute.
3. Reviews should have the highest score calculated by Eq. (2) while satisfying conditions 1 and 2.

[Figure 1. Procedure of modeling process. Flowchart steps: Start, Present reviews, User judgment, Update RMRate, Repeat?, Select reviews, End.]

$$\mathrm{Score}_r(r, u_i) = \frac{\sum_{k} |e_r^{k} - \bar{e}_r| \cdot \mathrm{RMR}_{ik}^{2}}{K_r} \log N_r, \tag{2}$$

where $u_i$ is a user, $r$ is a review, $e_r^{k}$ is the evaluation of attribute $k$ in $r$, and $\bar{e}_r$ is the average evaluation of $r$ over all attributes. $N_r$ and $K_r$ are the number of characters and the number of mentioned attributes in $r$, respectively. This equation gives a high score to a review in which the evaluation of an attribute with a high current RMRate deviates, upward or downward, from the review’s average over all attributes. The $K_r$ in the denominator gives priority to reviews focusing on specific attributes. Equation (2) also considers the length (number of characters) of reviews, because we found in a preliminary experiment that users tended to consult longer reviews rather than shorter ones.

The RMRate is calculated based on the correspondence of polarity between the item-level evaluation (rating) and the attribute-level evaluation. Therefore, presenting reviews satisfying conditions 1 and 2 aims to obtain feedback on whether or not the polarity of the attribute-level evaluation is the same as that of the user’s rating of the target item. As a termination condition, we decided to repeat the presentation of reviews 20 times.

Recommender system based on PV models
This subsection describes a recommender system based on PV models obtained as described in the previous subsection. A straightforward approach is to recommend the items whose predicted ratings for a user are higher than those of other items. Instead of predicting ratings, this paper proposes to estimate a degree of recommendation for an item based on the user’s PV model.

Given the set of RMRates of a user $u_i$ ($\{\mathrm{RMR}_{ik} \mid a_k \in A\}$), the score of an item $x_j$ is defined as follows.

$$\mathrm{Score}_i(x_j, u_i, c_j) = \sum_{a_k \in A_i} \left(e_j^{k} - e_{c_j}^{k}\right) \cdot \mathrm{RMR}_{ik}^{2}, \tag{3}$$

$$A_i = \left\{ a_k \;\middle|\; \mathrm{RMR}_{ik} \geq \frac{\sum_{a_l \in A} \mathrm{RMR}_{il}}{|A|} \right\}, \tag{4}$$

where $c_j$ is the item category to which $x_j$ belongs, $e_j^{k}$ is the average evaluation for $a_k$ of $x_j$, and $e_{c_j}^{k}$ is the average evaluation for $a_k$ of the items belonging to $c_j$. As these average evaluations, we used the values released on the online review site.

The score is calculated based only on the attributes for which the target user’s RMRate is at or above the average of his/her RMRates over all attributes (Eq. (4)). We employ this restriction in order to focus on attributes which strongly affect the user’s decision making. For the same reason, we use the squared RMRate in the calculation.

EXPERIMENTS
Settings
An experiment with test participants was conducted. The experiment is divided into two phases: a user modeling phase and a recommendation phase. We asked 20 graduate / undergraduate students in the engineering field to take part in the experiment.

In the user modeling phase, the proposed dynamic review presentation method is compared with a random presentation method. In both methods, 3 reviews about different hotels are combined into one set. Test participants were asked to evaluate 20 different sets as if they were going to book a hotel for the specified purpose.

Reviews and hotel information were collected from the online hotel review site 4travel (http://4travel.jp/). The number of collected reviews is 592. Regarding the polarity of attribute-level evaluations, which is required for calculating RMRate, the average evaluation over all attributes is calculated for each review. If the evaluation of an attribute is equal to or more than the average, it is regarded as a positive evaluation; otherwise it is regarded as a negative evaluation. 4travel employs 7 attributes: access, cost performance (CP), service, room, bath, meal, and barrier-free. As it is supposed that whether a hotel is barrier-free or not would not affect the decision making of the test participants in this experiment, we removed it.

We supposed two purposes of booking hotels, i.e. business and sightseeing, and prepared two datasets for each purpose. The test participants were divided into 4 groups (5 persons each) as shown in Table 1. We designed the experiment so that hotels in different areas are presented by different presentation methods. As the purpose of booking hotels is supposed to affect participants’ decision making, the datasets used for a participant belong to the same purpose in order to keep his/her evaluation consistent. The order of presentation methods was rotated so as to remove the order effect.

Table 1. Used dataset for modeling.
Group          Dynamic                  Random
SightseeingA   Tokyo, Kanagawa          Hokkaido
SightseeingB   Hokkaido                 Tokyo, Kanagawa
BusinessA      Osaka, Kyoto             Tokyo, Aichi, Fukuoka
BusinessB      Tokyo, Aichi, Fukuoka    Osaka, Kyoto

In the recommendation phase, 10 hotels are selected based on the user model obtained by each presentation method. For comparison, an additional 10 hotels are selected based on the review site’s satisfaction ranking. Therefore, each participant was asked to evaluate at most 30 hotels; if different methods select the same hotels, the number of presented hotels is less than 30. The order of presenting items was shuffled so that the participants could not know by which method (model) a hotel was selected. We prepared datasets different from those of the modeling phase, as shown in Table 2. From the dataset, we removed hotels which were evaluated as 4 or more for all attributes, as such hotels are preferred by almost everyone regardless of their personal values. For each of the presented hotels, test participants were asked to evaluate it as either positive or negative.

Table 2. Used dataset for recommendation.
Group          Dynamic                Random
SightseeingA   Osaka, Hyogo, Kyoto    Okinawa
SightseeingB   Okinawa                Osaka, Hyogo, Kyoto
BusinessA      Kanagawa               Hyogo, Kyoto
BusinessB      Hyogo, Kyoto           Kanagawa

Result of User modeling
After the experiment, test participants were asked to state the attributes they were concerned with. Table 3 shows the average RMRate over the attributes they were concerned with. The table shows that the average RMRate by the random presentation method is higher than that by the dynamic presentation method for all groups. This is because the dynamic presentation method focuses on specific attributes, so its estimation for the other attributes is insufficient compared with the random presentation method.

Table 4 compares the number of reviews selected by test participants. The number of selected reviews is counted for each attribute whose RMRate is relatively high: 0.7 or more (≥ 0.7) / 0.8 or more (≥ 0.8). Each cell shows the number of attributes for which 10 or more (≥ 10) / fewer than 10 (< 10) reviews were selected. The table shows that the dynamic presentation method estimates RMRate from many more reviews than the random presentation method. This means that when an attribute has a high RMRate, the dynamic presentation method has estimated it from sufficient information, unlike the random presentation method.

Table 4. Number of selected reviews.
Presentation method     RMRate    ≥ 10 reviews    < 10 reviews
Dynamic presentation    ≥ 0.7     28              2
                        ≥ 0.8     17              0
Random presentation     ≥ 0.7     13              21
                        ≥ 0.8     9               12

Result of Recommendation
Table 5 shows the average precision: the ratio of items that test participants judged as positive to all recommended items. Both the dynamic and random presentation methods achieved higher precision than the satisfaction ranking regardless of the purpose of booking hotels. This result shows the effectiveness of modeling users’ personal values from browsing histories of reviews.

Table 5. Comparison of precision.
Purpose        Dynamic    Random    Satisfaction
Sightseeing    0.720      0.800     0.670
Business       0.630      0.710     0.580
Total          0.675      0.755     0.625

It is also shown that the precision of the dynamic presentation method is lower than that of the random method. This result corresponds to the fact that the dynamic presentation method puts priority on fast estimation rather than exhaustive estimation. That is, identifying as many attributes with high RMRate as possible is expected to be effective in terms of accuracy.

CONCLUSION
This paper proposed a method for obtaining personal values-based user models from a user’s browsing history of reviews. The proposed method dynamically selects and presents reviews mentioning attributes on which a user might put priority. A method for selecting items to recommend based on the obtained user models was also proposed. An experimental result with test participants shows that user models obtained from browsing history achieved higher recommendation accuracy than recommendation based on a review site’s satisfaction ranking. It is also shown that the proposed dynamic presentation method is effective for identifying specific attributes of high RMRate from relatively many reviews.

As the number of read-only users is much larger than the number of users posting reviews, the proposed method will contribute to extending the applicability of personal values-based recommender systems. Future work includes application to other kinds of items, as well as automatic collection of reviews to be used for user modeling.
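As an illustration of how Eq. (1) could be evaluated from browsing feedback, the following is a minimal Python sketch, not the implementation used in the experiment. It assumes each feedback round is recorded as the polarity of the user's rating together with the attribute-level evaluations of the review the user chose as most helpful, and it applies the polarity rule from the Settings subsection (an attribute is positive when its evaluation is at or above the review's own average). The names `ATTRIBUTES`, `attribute_polarities`, and `rmrate` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: the six 4travel attributes kept in the experiment, with
# hypothetical short keys chosen for this sketch.
ATTRIBUTES = ["access", "cp", "service", "room", "bath", "meal"]


def attribute_polarities(evals: Dict[str, float]) -> Dict[str, bool]:
    """Return True (positive) for attributes evaluated at or above the
    review's own average over all attributes, False (negative) otherwise."""
    mean = sum(evals.values()) / len(evals)
    return {a: v >= mean for a, v in evals.items()}


def rmrate(feedback: List[Tuple[bool, Dict[str, float]]]) -> Dict[str, float]:
    """Eq. (1): for each attribute, the fraction of feedback rounds in which
    the user's rating polarity equals the attribute-level polarity of the
    review selected in that round.

    Each feedback item is (rating_is_positive, evaluations_of_selected_review),
    and every review is assumed to carry an evaluation for every attribute.
    """
    if not feedback:
        raise ValueError("at least one feedback round is required")
    matches = {a: 0 for a in ATTRIBUTES}
    for rating_positive, review_evals in feedback:
        polarity = attribute_polarities(review_evals)
        for a in ATTRIBUTES:
            if polarity[a] == rating_positive:  # delta(p_ij, p_ij^k) = 1
                matches[a] += 1
    return {a: matches[a] / len(feedback) for a in ATTRIBUTES}


if __name__ == "__main__":
    rounds = [
        (True, {"access": 5, "cp": 3, "service": 3, "room": 3, "bath": 3, "meal": 3}),
        (False, {"access": 2, "cp": 4, "service": 4, "room": 4, "bath": 4, "meal": 4}),
    ]
    print(rmrate(rounds))
```

In this toy input the user's rating follows the polarity of "access" in both rounds, so that attribute receives the highest RMRate and would become the target attribute for the next round of review selection.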
Table 3. Average RMRate for concerned attributes.
Group          Dynamic    Random
SightseeingA   0.536      0.547
SightseeingB   0.567      0.675
BusinessA      0.483      0.636
BusinessB      0.538      0.744

ACKNOWLEDGMENTS
This work was partly supported by Grant-in-Aid for Research on Priority Areas, Tokyo Metropolitan University, “Research on social big data” and JSPS KAKENHI Grant Numbers JP15H02780 and JP16K12535.

REFERENCES
1. Jesús Bobadilla, Fernando Ortega, Antonio Hernando, and Jesús Bernal. 2012. A collaborative filtering approach to mitigate the new user cold start problem. Know.-Based Syst. 26 (February 2012), 225-238. DOI=http://dx.doi.org/10.1016/j.knosys.2011.07.021
2. Francois Fouss, Alain Pirotte, Jean-Michel Renders, and Marco Saerens. 2007. Random-Walk Computation of Similarities between Nodes of a Graph with Application to Collaborative Recommendation. IEEE Trans. on Knowl. and Data Eng. 19, 3 (March 2007), 355-369. DOI=http://dx.doi.org/10.1109/TKDE.2007.46
3. Mustansar Ali Ghazanfar and Adam Prügel-Bennett. 2010. An Improved Switching Hybrid Recommender System Using Naive Bayes Classifier and Collaborative Filtering. In Proceedings of The International Multi Conference of Engineers and Computer Scientists, 1, 493–502.
4. Shunichi Hattori and Yasufumi Takama. 2014. Recommender System Employing Personal-Value-Based User Model. Journal of Advanced Computational Intelligence and Intelligent Informatics, 18, 2, 157–165. DOI=http://dx.doi.org/10.20965/jaciii.2014.p0157
5. Hiroshi Ishikawa. 2015. Social Big Data Mining. CRC Press.
6. Chanaka Jayawardhena. 2004. Personal values’ influence on e-shopping attitude and behaviour. Internet Research, 14, 2, 127–138.
7. Paul T. Costa and Robert R. McCrae. 1992. Revised NEO Personality Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI). Psychological Assessment Resources.
8. Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix Factorization Techniques for Recommender Systems. Computer 42, 8 (August 2009), 30-37. DOI=http://dx.doi.org/10.1109/MC.2009.263
9. Daniel D. Lee and H. Sebastian Seung. 2000. Algorithms for non-negative matrix factorization. In Proceedings of the 13th International Conference on Neural Information Processing Systems (NIPS’00), T. K. Leen, T. G. Dietterich, and V. Tresp (Eds.). MIT Press, Cambridge, MA, USA, 535-541.
10. Greg Linden, Brent Smith, and Jeremy York. 2003. Amazon.com Recommendations: Item-to-Item Collaborative Filtering. IEEE Internet Computing 7, 1 (January 2003), 76-80. DOI=http://dx.doi.org/10.1109/MIC.2003.1167344
11. Paolo Massa and Paolo Avesani. 2004. Trust-aware Collaborative Filtering for Recommender Systems. Lecture Notes in Computer Science, 3290, 492–508. DOI=https://doi.org/10.1007/978-3-540-30468-5_31
12. Maria Augusta S.N. Nunes and Rong Hu. 2012. Personality-based recommender systems: an overview. In Proceedings of the sixth ACM conference on Recommender systems (RecSys’12). ACM, New York, NY, USA, 5-6. DOI=http://dx.doi.org/10.1145/2365952.2365957
13. S.-T. Park, D. Pennock, O. Madani, N. Good, and D. DeCoste. 2006. Naïve Filterbots for Robust Cold-start Recommendations. In KDD’06, 699–705.
14. Paul Resnick, Neophytos Iacovou, Mitesh Suchak, Peter Bergstrom, and John Riedl. 1994. GroupLens: an open architecture for collaborative filtering of netnews. In Proceedings of the 1994 ACM conference on Computer supported cooperative work (CSCW ’94). ACM, New York, NY, USA, 175-186. DOI=http://dx.doi.org/10.1145/192844.192905
15. Milton Rokeach. 1973. The Nature of Human Values. New York: The Free Press.
16. Ruslan Salakhutdinov and Andriy Mnih. 2007. Probabilistic Matrix Factorization. In Proceedings of the 20th International Conference on Neural Information Processing Systems (NIPS’07), J. C. Platt, D. Koller, Y. Singer, and S. T. Roweis (Eds.). Curran Associates Inc., USA, 1257-1264.
17. Yasufumi Takama, Yu-Sheng Chen, Ryori Misawa, and Hiroshi Ishikawa. 2017. Potential of Personal Values-Based User Modeling for Long Tail Item Recommendation. In Proceedings of 2017 International Workshop on Advanced Computational Intelligence and Intelligent Informatics (IWACIII 2017), AS1-3.2.
18. Yasufumi Takama, Ryori Misawa, Yu-Sheng Chen, Shunichi Hattori, and Hiroshi Ishikawa. 2016. Proposal of Hybrid Recommender Systems Based on Personal Values-based Collaborative Filtering. In Proceedings of the 7th International Symposium on Computational Intelligence and Industrial Applications (ISCIIA2016), SM-GS3-01.
19. Yasufumi Takama, Takayuki Yamaguchi, and Shunichi Hattori. 2016. Personal Values-Based Item Modeling and its Application to Recommendation with Explanation. Journal of Advanced Computational Intelligence and Intelligent Informatics, 20, 6, 867–874. DOI=http://dx.doi.org/10.20965/jaciii.2016.p0867
20. Wen Wu, Li Chen, and Liang He. 2013. Using personality to adjust diversity in recommender systems. In Proceedings of the 24th ACM Conference on Hypertext and Social Media (HT’13). ACM, New York, NY, USA, 225-229. DOI=http://dx.doi.org/10.1145/2481492.2481521
21. Xiaohui Yan, Jiafeng Guo, Yanyan Lan, and Xueqi Cheng. 2013. A biterm topic model for short texts. In Proceedings of the 22nd international conference on World Wide Web (WWW ’13). ACM, New York, NY, USA, 1445-1456. DOI=https://doi.org/10.1145/2488388.2488514
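The review score of Eq. (2) and the item score of Eqs. (3) and (4) can be sketched in Python as follows. This is a hedged illustration under stated assumptions rather than the system used in the experiment: the sum in Eq. (2) is taken over every attribute that has an evaluation in the review, the number of mentioned attributes $K_r$ is passed in separately, and the natural logarithm is used because the paper does not state a base. The function names `review_score`, `focused_attributes`, and `item_score` are hypothetical.

```python
import math
from typing import Dict, Set


def review_score(evals: Dict[str, float], rmr: Dict[str, float],
                 n_chars: int, k_mentioned: int) -> float:
    """Eq. (2): RMRate-squared weighted deviation of each attribute evaluation
    from the review's average, divided by K_r and multiplied by log N_r.
    Assumption: natural logarithm; the sum runs over all evaluated attributes."""
    if n_chars < 1 or k_mentioned < 1:
        raise ValueError("n_chars and k_mentioned must be positive")
    mean_r = sum(evals.values()) / len(evals)
    weighted = sum(abs(e - mean_r) * rmr[a] ** 2 for a, e in evals.items())
    return weighted / k_mentioned * math.log(n_chars)


def focused_attributes(rmr: Dict[str, float]) -> Set[str]:
    """Eq. (4): attributes whose RMRate is at or above the user's mean RMRate."""
    mean = sum(rmr.values()) / len(rmr)
    return {a for a, v in rmr.items() if v >= mean}


def item_score(item_avg: Dict[str, float], category_avg: Dict[str, float],
               rmr: Dict[str, float]) -> float:
    """Eq. (3): over the focused attributes, sum the item's average evaluation
    minus the category average, weighted by the squared RMRate."""
    return sum((item_avg[a] - category_avg[a]) * rmr[a] ** 2
               for a in focused_attributes(rmr))


if __name__ == "__main__":
    # Toy values chosen for illustration only; they are not from the paper.
    rmr = {"access": 1.0, "cp": 0.5, "room": 0.0}
    print(review_score({"access": 5.0, "cp": 3.0, "room": 1.0}, rmr,
                       n_chars=100, k_mentioned=2))
    print(item_score({"access": 4.5, "cp": 3.0, "room": 5.0},
                     {"access": 4.0, "cp": 4.0, "room": 3.0}, rmr))
```

Note that in the toy item the "room" evaluation is far above its category average but contributes nothing, because the user's RMRate for "room" falls below the mean and Eq. (4) excludes it; this is the intended effect of restricting the score to attributes that strongly affect the user's decisions.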