SHORT PAPER
Proceedings of the ACM RecSys 2010 Workshop on User-Centric Evaluation of Recommender Systems and Their Interfaces (UCERSTI), Barcelona, Spain, Sep 30, 2010. Published by CEUR-WS.org, ISSN 1613-0073, online: ceur-ws.org/Vol-612/paper2.pdf
Copyright © 2010 for the individual papers by the papers' authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors: Knijnenburg, B.P., Schmidt-Thieme, L., Bollen, D.

Recommender Systems: Investigating the Impact of Recommendations on User Choices and Behaviors

Robin Naughton, Xia Lin
The iSchool at Drexel, College of Information Science and Technology
3141 Chestnut Street, Philadelphia, PA 19104 USA
{rnaughton,xlin}@ischool.drexel.edu

ABSTRACT
Recommender systems have been used in many information systems, helping users handle information overload by providing them with specific recommendations that fulfill their information seeking needs. Research in this area has focused on recommender system algorithms and on improving the core technology so that recommendations are robust. However, little research has taken a user-centered perspective on the recommendations these systems provide and on their impact on users' information behaviors. In this paper, we describe the results of an exploratory survey study of a book recommender system, LibraryThing, and the impact of recommendations on user choices, particularly what users do as a result of getting a recommendation. Our results indicate that survey respondents prefer member recommendations over the algorithm-based automatic recommendations, and that about two thirds of the users who responded are influenced by the recommendations in their various information activities.

Categories and Subject Descriptors
H5.3 [Information Interfaces and Presentation]: Group and Organization Interfaces - collaborative computing, organizational design, web-based interaction

General Terms
Computer applications, Design, Evaluation

Keywords
Recommender systems, user-centered design, survey study, user information behaviors

1. INTRODUCTION
Recommender systems offer a solution to the problem of information overload by providing a way for users to receive specific information that fulfills their information needs. These systems help people make choices that impact their daily lives, and according to Resnick and Varian [10], "Recommender Systems assist and augment this natural social process." As more information is produced, the need for and growth of recommender systems continue to increase. Recommender systems can be found in many domains, ranging from movies (MovieLens.org) to books (LibraryThing.com) to e-commerce (Amazon.com). Research in this area is also growing to meet the demand, focusing on the core recommender technology and the evaluation of recommender algorithms. However, there is a need for user-centered research into recommender systems that looks beyond the algorithms to people's use of the recommendations and to the impact of those recommendations on people's choices. With this in mind, the study objective is to understand the impact of recommendations on user choices and behavior through the use of recommender systems. This paper presents the results of an exploratory survey of users of a book recommender system, LibraryThing, focusing on whether users follow the recommendations they receive and how those recommendations impact their choices, particularly what users do as a result of getting a recommendation.

2. LITERATURE REVIEW
2.1 Recommender Systems
Resnick and Varian [10] chose to focus on the term "recommender system" rather than "collaborative filtering" because a recommender system may or may not include collaboration, and it may suggest interesting items to users in addition to indicating what should be filtered out. By using the term "recommender system," it becomes clear that the system is not just about the algorithm but about the overall goal. It also becomes an umbrella term for different types of recommender systems that use various algorithms to achieve their goals. Recommender systems can use algorithms that are constraint-based (a question-and-answer, conversational method) [3], content-based (CB; comparing item descriptions), collaborative filtering (CF; using user ratings and taste similarity), or hybrid (a combination of different algorithms) [7, 15]. The collaborative filtering technique has gained popularity over the years [5], and social networking aspects help to strengthen the filtering techniques. The hybrid technique combines collaborative filtering with content-based techniques to capitalize on the strengths of each method.
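To make the contrast between these algorithm families concrete, the following minimal sketch (in Python, written for this review rather than taken from any cited system; the ratings data, user names, and the choice of cosine similarity are illustrative assumptions) shows the core of a user-based collaborative filtering step: predicting a rating for an unseen item by weighting other users' ratings by taste similarity.

from math import sqrt

# Toy ratings matrix: user -> {item: rating}. Purely illustrative data.
ratings = {
    "ann":  {"book_a": 5, "book_b": 3, "book_c": 4},
    "ben":  {"book_a": 4, "book_b": 3, "book_d": 5},
    "cara": {"book_b": 2, "book_c": 5, "book_d": 4},
}

def cosine_similarity(u, v):
    """Taste similarity between two users over the items both have rated."""
    shared = set(ratings[u]) & set(ratings[v])
    if not shared:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in shared)
    norm_u = sqrt(sum(ratings[u][i] ** 2 for i in shared))
    norm_v = sqrt(sum(ratings[v][i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    scores = [(cosine_similarity(user, other), r[item])
              for other, r in ratings.items()
              if other != user and item in r]
    total = sum(sim for sim, _ in scores)
    return sum(sim * rating for sim, rating in scores) / total if total else None

print(predict("ann", "book_d"))  # prints a predicted rating between 4 and 5

A content-based method would instead compare item descriptions, and a hybrid method would combine the two kinds of scores.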
2.2 Evaluation of Recommender Systems
Research on recommender system algorithms is very active and seeks to enhance current recommender systems. However, as recommender systems improve, it is important that there is also user-centered research on their evaluation. According to Herlocker et al. [5], "To date, there has been no published attempt to synthesize what is known about the evaluation of recommender systems, nor to systematically understand the implications of evaluating recommender systems for different tasks and different contexts." Herlocker et al. [5] focused extensively on the problems of evaluating recommender systems, presenting methods of analysis and experiments that provide a framework for evaluation. Identifying three major challenges, they point out that algorithms perform differently on different datasets, that evaluation goals can differ, and that deciding on measurements for comparative evaluation can be difficult [5]. Hernández del Olmo and Gaudioso [6] proposed an alternative evaluation framework that focuses on the goal of the recommender system. They indicate that there is a shift in the field toward a broader, more general definition of recommender systems as systems that guide users to "useful/interesting objects" [6]. This redefinition of the recommender system's goal also frames a redefinition of the evaluation framework, implying that evaluation can be based on how well the system achieves the goal of guiding the user and providing useful/interesting items [6]. By dividing recommenders into subsystems corresponding to these two functions, the authors suggest that each recommender system will have one subsystem more active than the other, and that the closer the two are in terms of activity, the closer the system is to achieving its global objective.
The work of Herlocker et al. [5] and Hernández del Olmo and Gaudioso [6] offers evaluation frameworks that function across different domains and algorithms. However, they are still steps away from evaluating recommender systems from the user perspective. A few steps closer is research focused on improving the user experience. Celma and Herrera [2] proposed "item-centric" and "user-centric" evaluation methods to identify novel recommendations in CF and CB systems, and found that users perceive recommendations from CF to be of higher quality "even though CF recommends less novel items than CB" [2]. O'Donovan and Smyth's [8, 9] research on trust in recommender systems defines two trust levels, context-specific trust and system/impersonal trust, to help create and preserve accuracy and robustness within recommender systems. Ziegler and Golbeck's [16] research into trust and interest similarity focused on the link between trust and a person's interests, concluding that the more users trust each other, the more similar their ratings are. Tintarev [13] and Tintarev and Masthoff [14] argue for effective explanations that can increase user trust, help users make good decisions, and improve the user experience.
Although much of the research is aimed at improving the algorithms, the literature shows movement toward a focus on the user. Tintarev and Masthoff's [14] use of two focus groups to determine how participants would like to be recommended, or dissuaded from watching, a movie indicates a change in the field toward direct contact with users. Accuracy metrics for algorithms are not enough to determine the true impact on user choices.
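This last point can be made concrete. Algorithm-centric evaluation of the kind surveyed by Herlocker et al. [5] often reduces to an error score over held-out ratings, for example mean absolute error. The short sketch below (Python, with invented data and a hypothetical helper name, not code from any cited work) computes such a score, and nothing in it reflects whether users ever act on the recommendations.

def mean_absolute_error(predicted, actual):
    """MAE over held-out (user, item) pairs: a lower score is 'better', but it says
    nothing about whether users follow or benefit from the recommendations."""
    pairs = [(p, actual[key]) for key, p in predicted.items() if key in actual]
    return sum(abs(p - a) for p, a in pairs) / len(pairs)

# Illustrative held-out ratings vs. predictions for (user, item) pairs.
actual = {("ann", "book_d"): 5, ("ben", "book_c"): 3}
predicted = {("ann", "book_d"): 4.5, ("ben", "book_c"): 3.5}
print(mean_absolute_error(predicted, actual))  # 0.5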
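LibraryThing does not publish how these recommendation types are computed (and, as noted above, explicitly declines to for Special Sauce), so the sketch below is only a plausible illustration, under assumed data structures and invented book and user names, of what types 3) and 4) could look like: tag overlap measured with Jaccard similarity, and library co-occurrence counts. It is not LibraryThing's actual implementation.

# Hypothetical data: book -> set of tags, and user -> set of owned books.
book_tags = {
    "dune":        {"science fiction", "classic", "desert"},
    "hyperion":    {"science fiction", "space opera", "classic"},
    "middlemarch": {"classic", "victorian", "novel"},
}
libraries = {
    "u1": {"dune", "hyperion"},
    "u2": {"dune", "hyperion", "middlemarch"},
    "u3": {"dune", "middlemarch"},
}

def similar_by_tags(book):
    """'Books with similar tags': rank other books by Jaccard overlap of tag sets."""
    tags = book_tags[book]
    scores = {other: len(tags & t) / len(tags | t)
              for other, t in book_tags.items() if other != book}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def people_also_have(book):
    """'People with this book also have...': count co-occurrence across libraries."""
    counts = {}
    for owned in libraries.values():
        if book in owned:
            for other in owned - {book}:
                counts[other] = counts.get(other, 0) + 1
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

print(similar_by_tags("dune"))   # 'hyperion' scores higher than 'middlemarch'
print(people_also_have("dune"))  # [('hyperion', 2), ('middlemarch', 2)]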
3. LIBRARYTHING
Book recommender systems (LibraryThing, GoodReads, BookMooch, Amazon, All Consuming, Shelfari, etc.) allow users to catalogue books and to receive and share recommendations within a social community. Since its launch in 2005, LibraryThing has grown to over 920,000 users, with the largest group representing librarians, and 45.5 million books have been catalogued. Where some book recommender systems offer a single algorithm, LibraryThing has multiple recommender algorithms [1]. According to the founder, Tim Spalding, "We've got five algorithms so far, and a few more I haven't brought live, or which lie underneath the current ones. … LibraryThing's data is particularly suited to it, the books you own being a much better representation of taste than the books you buy on a given retailer" [11]. It is a robust book recommender system with a strong social network, which offers a fertile area for user-centered research.
LibraryThing users can add book titles to their accounts and receive book recommendations either directly from LibraryThing algorithms (automatic recommendations) or from other users of the website (member recommendations). Member recommendations are submitted through a manual process that allows LibraryThing users to submit recommendations for any book by going to that book's recommendation page. The majority of recommendations are automatic, and for each book LibraryThing offers six types of recommendations: 1) LibraryThing Combined Recommendations, 2) Special Sauce Recommendations, 3) Books with similar tags, 4) People with this book also have... (more common), 5) People with this book also have... (more obscure), and 6) Books with similar library subjects and classification. Most of these titles are self-explanatory, in that a user can easily get the general idea of the type of recommendations being offered. For example, "LibraryThing Combined Recommendations" represents a combination of the other types of automatic recommendations. However, "Special Sauce Recommendations" seems to be the one title that is not self-explanatory and offers no immediate understanding of what users should expect. Spalding says, "Our Special Sauce Recommendation engine is the only one we don't talk about how it works" [11].
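4. RESEARCH DESIGN
This study used an online survey ("LibraryThing Recommendation Impact Survey") to explore the impact of LibraryThing recommendations on user choices. No personal or identifying information was collected. There were 10 questions using both open and closed question types. Two of the ten questions captured demographic data (gender and age range) so that responses could be grouped within a larger context. The other eight questions focused specifically on LibraryThing recommendations and on user preferences, influences, and actions. Before administering the survey, permission was obtained from Tim Spalding, and IRB approval was obtained from the University.

4.1 Implementation
On October 27, 2009, the recruitment letter with a link to the survey was posted to "Book Talk," a LibraryThing group recommended by Tim Spalding as a place for major discussions. Spalding pointed out that postings can be flagged as spam if posted to multiple groups, and the goal was to reach LibraryThing users rather than have the posting removed. However, after a few weeks in the "Book Talk" group, the posting was also added to the "Librarians who LibraryThing" group, one of the largest groups of LibraryThing users, which helped with getting survey respondents. The posting was checked repeatedly to make sure it was still on the first page of the active group discussion; if it was not, it was adjusted to remain prominent and improve visibility and the opportunity for user response. The survey was posted on LibraryThing for five months, from October 27, 2009 to March 27, 2010.

4.2 Participants
Participants were LibraryThing users 18 years and older who had previously used or were currently using LibraryThing and who volunteered to take the survey by clicking the link in the LibraryThing group posting. The expectation was that the survey might receive about 100 self-selected respondents; within the five months, there were 62 survey respondents.

5. RESULTS
The survey data were analyzed with descriptive statistics to generate percentages and with iterative pattern coding of the qualitative data to identify major themes [4].

As an illustration of the descriptive-statistics step (the analysis procedure is not described beyond the sentence above, so the helper below is an assumption), the counts and percentages reported in Tables 1-5 can be produced with a simple tally over the closed-question responses; the response labels here are those of Table 1.

from collections import Counter

def tally(responses):
    """Count each answer and report its share of all respondents, as in Tables 1-5."""
    counts = Counter(responses)
    n = len(responses)
    return {answer: (count, round(100 * count / n, 1))
            for answer, count in counts.most_common()}

# Reconstructed closed-question answers matching Table 1 (62 respondents).
preferences = (["Member"] * 30 + ["Automatic"] * 15 + ["Neither"] * 9
               + ["Both"] * 4 + ["No preference"] * 4)
print(tally(preferences))
# {'Member': (30, 48.4), 'Automatic': (15, 24.2), 'Neither': (9, 14.5),
#  'Both': (4, 6.5), 'No preference': (4, 6.5)}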
5.1 Demographics
Two demographic questions (gender and age range) helped to frame the population responding to the survey. For gender, 50 females (81%) and 12 males (19%) responded to the survey. All age range groups had at least 3 participants. The 25-34 age range accounted for 42% (26) of participants and the 45-54 age range for 26% (16) of participants, representing the two largest groups responding to the survey. Overall, no age range had zero participants, but the 55-64 age range was the only group with no male participants.

5.2 Member vs. Automatic Recommendations
In their own words, participants described their preferences regarding automatic and member recommendations, and from the data five preference categories were developed: automatic, member, both, neither, and no preference. Of the 62 participants who responded to the survey, the majority, 48% (30), preferred member recommendations, while only 24% (15) preferred automatic recommendations. The other 28% (17) of participants preferred neither, preferred both, or had no preference (Table 1).

Table 1: User Recommendation Preferences
  User Preference     # of Participants
  Member              30 (48%)
  Automatic           15 (24%)
  Neither              9 (15%)
  Both                 4 (6.5%)
  No Preference        4 (6.5%)

In addition, there was an even split of participants (50%) between those who had submitted member recommendations and those who had not. Participants were also asked to identify their preference for a specific type of LibraryThing automatic recommendation; the top two preferences were "LibraryThing Combined Recommendations" and "People with this book also have... (more common)" (Table 2).

Table 2: Users' Most Valuable Automatic Recommendations
  Automatic Recommendation Type                              # of Participants
  LibraryThing Combined Recommendations                      15
  People with this book also have... (more common)           14
  Other                                                      12
  Books with similar tags                                     9
  Special Sauce Recommendations                                5
  Books with similar library subjects and classification      4
  People with this book also have... (more obscure)           3

5.2.1 Discussion
The data suggested that twice as many participants preferred member recommendations over automatic recommendations. Based on the reasons participants provided, a distinction could be drawn between those preferring member recommendations and those preferring automatic recommendations. Participants who preferred member recommendations seemed to be interested in the social connection between the recommendation and the recommender, which allowed them to assess the recommender and the recommendation in relation to their own tastes. As one participant described, "Even though automatic recommendations may more 'accurately measure' my tastes and interests based upon the books I have in my library, I feel recommendations from real human beings have the advantage of the recommender's intuitive understanding of what I would find interesting based upon their own impressions of books they know I've read." Alternatively, participants who preferred automatic recommendations seemed to be interested in the logical connection between the recommendation and user libraries, where the algorithm looks at all items. As one participant stated, "I prefer automatic recommendations because they are based on all users with a particular book, not just on one member who thinks a book is like another." In both cases, the preference for member or automatic recommendations is influenced by the user's trust in particular aspects of the system, which affects the level of trust the user has in the system and in fellow users. Research into trust models, such as a user's trust in another user based on that user's profile, or a user's trust in the system based on the items it recommends, can begin to offer another dimension for developing recommendations [8, 9].
The top preferences for automatic recommendations (Table 2) suggest that LibraryThing users want recommendation types that are additionally filtered (combined recommendations) and socially connected (people also have). The other preferences suggest that there may be overlap with the combined recommendations, a lack of knowledge ("What is special sauce? I missed that!"), or an alternative approach to getting recommendations ("People whose library is similar to mine," "Top 1,000 on my recommendations page," "The stars, recommendations in forums").
Since automatic and member recommendations present different ways of getting recommendations within the system, Table 1 shows, as expected, that some participants preferred both (6.5%) or had no preference (6.5%). However, the "neither" category suggests that some participants (15%) actively did not prefer automatic or member recommendations but instead preferred to get their recommendations from other sources, such as message boards ("message boards on the site--it's much more useful for me to read another member's opinion about a book or to see a dialogue about a book on the message boards than to just see a list") or chat ("The recommendations that I DO pay attention to, however, are the ones made personally from people I regularly chat with on LT, and whose tastes I know I share"). The "neither" category presents an opportunity to understand why some participants are not using the traditional automatic and member recommendations, and how recommender systems can be improved to serve this population, which seeks alternative methods of getting recommendations that combine multiple sources. These results also suggest looking at the overall goal of the recommender system to identify how best to guide users and filter content appropriately to satisfy user wants and needs [6].

5.3 Recommendation Impact
Users were asked whether they checked their recommendations, what they did with the information, and how it influenced their choices. Table 3 shows that only 8 (13%) participants never checked their recommendations, while 46 (74%) participants checked their recommendations daily, weekly, or periodically. Most of the 8 (13%) participants who chose "Other" checked their recommendations on a different schedule than the options presented in the survey question.

Table 3: Frequency of Users Checking Recommendations
  User Checks     # of Participants
  Periodically    22 (35%)
  Weekly          15 (24%)
  Daily            9 (15%)
  Other            8 (13%)
  Never            8 (13%)

After checking their recommendations, 61% (38) of participants read and followed up on recommendations (Table 4).

Table 4: Participant Follow-up on Recommendations
  Follow up                                     # of Participants
  Read and follow up on recommendations         38 (61%)
  Only read recommendations                      6 (10%)
  Never read or follow up on recommendations     9 (15%)
  Other                                          9 (15%)

Participants were asked to select the specific actions they took as a result of recommendations and could select multiple responses to indicate the types of influence the recommendations had on their choices. As a result, there were 167 responses, exceeding the number of participants (62), with an average of 2.7 responses per participant. Table 5 shows the selection options and the number of responses per selection.

Table 5: Recommendation Influence
  Recommendation Influence                                           # of Responses
  Added books to my library.                                         36
  Purchased the recommended book or added to a list for purchase.    35
  Browsed user libraries that have the recommended book.             31
  Reminded you of something else.                                    29
  Submitted a recommendation.                                        19
  Other                                                              17

5.3.1 Discussion
It was important to know whether users were actively engaging with the recommender system or taking a passive approach by just reading whatever appears on the homepage. The data show that a majority of the participants checked whether they had new LibraryThing recommendations (Table 3) and followed up on those recommendations by adding books to their libraries, purchasing recommended books or putting recommended books on a list to purchase, and browsing other user libraries containing the recommended book (Table 4). Table 5 shows 17 "Other" responses, suggesting a need for additional options for users to describe the influences of LibraryThing recommendations, such as no influence, adding to a wishlist within or outside of LibraryThing, borrowing from a local library, and discovery research leading to additional information. Most participants, 46 (74%), found LibraryThing recommendations useful and stated that the recommendations helped them find books they would not have found otherwise. One participant pointed out the international nature of LibraryThing: "Useful as an introduction to unknown authors and series - particularly American titles - often difficult to source in the UK." Nine (15%) participants found the recommendations "somewhat" useful, and 7 (11%) participants did not find the recommendations useful. One participant stated, "I suppose I feel the recommendations function is less useful because it doesn't account for shifting literary interests," highlighting an issue for user satisfaction and perceived usefulness.
Perceived usefulness is another area of research that can help shed light on recommender systems from the user's perspective. Swearingen and Sinha's [12] research comparing online and offline recommendations focused on perceived usefulness and found that what mattered most was whether users got useful recommendations, the reason for using the recommender system in the first place. Overall, LibraryThing participants checked, followed, acted upon, and found useful the recommendations they received from LibraryThing, and across multiple questions they indicated the impact of recommendations on their choices.

6. LIMITATIONS & FUTURE RESEARCH
One limitation of this study is the self-selected nature of the online survey, which limits the respondents to frequent users of LibraryThing who chose to respond. This can create a self-selected group of users that does not represent the full range of LibraryThing users. As a consequence, the results cannot easily be generalized to the larger population, and an exploratory survey only scratches the surface of the user perspective. However, this research provides a valuable starting point for future research into user experience with recommender systems, particularly focusing on user preference, user actions, and perceived usefulness of recommendations. Based on the themes identified, future research would include creating a more robust method of soliciting data directly from users and an in-depth analysis of the "other" categories identified, as these categories seem to indicate that users are using the system in unexpected ways, which in turn can help to improve recommender systems.
7. CONCLUSION
The main research goal of this study was to explore the impact of recommendations delivered through recommender systems on user choices and behaviors, particularly what users did as a result of getting a recommendation. Much of the literature on evaluation has focused on the algorithms [5, 6], but research into trust [8, 9, 16], explanations [13, 14], and design and usefulness [12] is getting closer to the user of the system. Understanding impact directly from users is an important aspect of developing recommender system evaluation research, and this study has contributed to this effort.
For LibraryThing, the results from this exploratory study indicate possible areas of improvement: limiting the number of automatic recommendation types, because participants preferred only 2-3 of the 6 automatic recommendation types; improving the submission of member recommendations, because twice as many participants preferred member recommendations over automatic recommendations; and providing alternative recommendations from other areas of LibraryThing, because participants indicated a growing need to get recommendations from alternative sources such as tags, message boards, and other areas of LibraryThing.
The research has shown that twice as many participants preferred member recommendations over automatic recommendations, and that participants checked, followed up on, acted upon, and found useful the recommendations they received. The findings indicate that there is more to uncover within the evaluation of recommender systems, and that users are an important aspect of understanding whether recommender systems are indeed useful and impactful in people's daily lives.

8. ACKNOWLEDGMENTS
We thank Tim Spalding, LibraryThing founder, and IMLS fellowship funding for making this research study possible.

9. REFERENCES
[1] LibraryThing. Available from: http://www.librarything.com.
[2] Celma, O. and Herrera, P. A new approach to evaluating novel recommendations. In Proceedings of the 2008 ACM Conference on Recommender Systems. ACM, Lausanne, Switzerland, 2008, 179-186.
[3] Felfernig, A. and Burke, R. Constraint-based recommender systems: technologies and research issues. In Proceedings of the 10th International Conference on Electronic Commerce. ACM, Innsbruck, Austria, 2008.
[4] Glaser, B.G. and Strauss, A.L. The constant comparative method of qualitative analysis. In The Discovery of Grounded Theory: Strategies for Qualitative Research. Aldine de Gruyter, Hawthorne, NY, 1967, 101-115.
[5] Herlocker, J.L., et al. Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22(1), 2004, 5-53.
[6] Hernández del Olmo, F. and Gaudioso, E. Evaluation of recommender systems: A new approach. Expert Systems with Applications, 35(3), 2008, 790-804.
[7] Huang, Z., et al. A graph-based recommender system for digital library. In Proceedings of the 2nd ACM/IEEE-CS Joint Conference on Digital Libraries. ACM, Portland, Oregon, USA, 2002, 65-73.
[8] O'Donovan, J. and Smyth, B. Trust in recommender systems. In Proceedings of the 10th International Conference on Intelligent User Interfaces. ACM, San Diego, California, USA, 2005, 167-174.
[9] O'Donovan, J. and Smyth, B. Is trust robust?: An analysis of trust-based recommendation. In Proceedings of the 11th International Conference on Intelligent User Interfaces. ACM, Sydney, Australia, 2006, 101-108.
[10] Resnick, P. and Varian, H.R. Recommender systems. Communications of the ACM, 40, 1997, 56-58.
[11] Starr, J. LibraryThing.com: The Holy Grail of Book Recommendation Engines. Searcher, 2007, 25-32.
[12] Swearingen, K. and Sinha, R. Beyond algorithms: An HCI perspective on recommender systems. In Proceedings of the SIGIR 2001 Workshop on Recommender Systems, 2001.
[13] Tintarev, N. Explanations of recommendations. In Proceedings of the 2007 ACM Conference on Recommender Systems. ACM, Minneapolis, MN, USA, 2007.
[14] Tintarev, N. and Masthoff, J. Effective explanations of recommendations: User-centered design. In Proceedings of the 2007 ACM Conference on Recommender Systems. ACM, Minneapolis, MN, USA, 2007, 153-156.
[15] Torres, R., et al. Enhancing digital libraries with TechLens+. In Proceedings of the 4th ACM/IEEE-CS Joint Conference on Digital Libraries. ACM, Tucson, AZ, USA, 2004, 228-236.
[16] Ziegler, C.-N. and Golbeck, J. Investigating interactions of trust and interest similarity. Decision Support Systems, 43(2), 2007, 460-475.