=Paper=
{{Paper
|id=Vol-2450/short5
|storemode=property
|title=Does the User Have A Theory of the Recommender? A Pilot Study
|pdfUrl=https://ceur-ws.org/Vol-2450/short5.pdf
|volume=Vol-2450
|authors=Muheeb Faizan Ghori,Arman Dehpanah,Jonathan Gemmell,Hamed Qahri-Saremi,Bamshad Mobasher
|dblpUrl=https://dblp.org/rec/conf/recsys/GhoriDGSM19
}}
==Does the User Have A Theory of the Recommender? A Pilot Study==
Muheeb Faizan Ghori, Arman Dehpanah, Jonathan Gemmell, Hamed Qahri-Saremi, Bamshad Mobasher
School of Computing, DePaul University, Chicago, Illinois
mghori2@depaul.edu, adehpana@depaul.edu, jgemmell@cdm.depaul.edu, hamed.saremi@depaul.edu, mobasher@depaul.edu

ABSTRACT

Recommender systems have become a mainstay of modern internet applications. They help users identify products to purchase on Amazon, movies to watch on Netflix and songs to enjoy on Pandora. Indeed, they have become so commonplace that users, through years of interaction with these systems, have developed an inherent understanding of how recommender systems function, what their objectives are, and how the user might manipulate them. We describe this understanding as the Theory of the Recommender. In this pilot study, we design and administer a survey to 25 users familiar with recommender systems. Our detailed analysis of their responses demonstrates that they possess an awareness of how recommender systems profile the user, build representations for items, and ultimately construct recommendations. The success of this pilot study provides support for a larger user study and the development of a grounded theory to describe the user's cognitive model of how recommender systems function.

CCS CONCEPTS
• Human-centered computing → User models; • Retrieval tasks and goals → Information extraction; • Computing methodologies → Cognitive science.

KEYWORDS
recommender systems, cognitive models, qualitative research

ACM Reference Format:
Muheeb Faizan Ghori, Arman Dehpanah, Jonathan Gemmell, Hamed Qahri-Saremi, and Bamshad Mobasher. 2019. Does the User Have A Theory of the Recommender? A Pilot Study. In Proceedings of Joint Workshop on Interfaces and Human Decision Making for Recommender Systems (IntRS '19), CEUR-WS.org, 9 pages.

IntRS '19: Joint Workshop on Interfaces and Human Decision Making for Recommender Systems, 19 Sept 2019, Copenhagen, DK. Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 INTRODUCTION

Recommender systems help users find items in large and complex information spaces such as those found in online retailers or streaming services. These systems have become an integral tool in modern internet applications, helping users cope with the increasing complexity of online environments and improving the company's competitive edge. Recommender systems often leverage user information and interactions with the system to provide personalized recommendations that satisfy the needs and preferences of the user.

These systems use several techniques to generate recommendations. Common approaches include collaborative filtering [16], content-based filtering [24], and model-based methods [2].

In the past decade, these technologies have become ubiquitous. Consequently, modern internet users are exposed to recommender systems on a daily basis. They view recommended items, consume items that catch their interest, and perhaps rate or leave feedback about these items. These repeated interactions may have led to an inherent understanding of how recommender systems work.

In this paper, we ask the question: does the user possess a theory of the recommender? The title of this paper is inspired by Premack and Woodruff's seminal paper [28] asking if chimpanzees understand the goals, perceptions, knowledge, and beliefs of others. In a similar vein, we want to ascertain if users understand the goals, perceptions, knowledge, and beliefs of recommender systems. We hypothesize that as recommender systems have become more commonplace and sophisticated, so too has the user's understanding of the recommenders.

A study into the user's understanding of recommender systems is critical for many reasons, among them: 1) the development of a framework for understanding the user's cognitive model of how recommender systems work, 2) predicting what behaviors such a cognitive model would elicit, and 3) designing systems that can identify and leverage these behaviors, thereby increasing the performance and value of recommender systems.

To test our hypothesis, we design and conduct a user study to identify concepts related to the participant's understanding of recommenders given several common scenarios. The identification of these concepts in this pilot study is a first step toward developing a robust grounded theory describing the user's cognitive model. Grounded theory allows the construction of theories that are grounded in empirical observations or data. It uses a constant comparative method where each observation is compared with others to find similarities and differences, generating concepts, hypotheses, and relationships that explain behavior and processes.

As a first step toward developing a rich grounded theory, we design and administer a survey instrument. The instrument consists of several questions probing the subject's understanding of recommender systems. Users are presented with scenarios typically encountered at online retailers, streaming services and news aggregators. They are then asked to answer questions based on their knowledge and personal experience. Our primary measure of user perception is the response to questions such as "Explain how you think the system recommended this item for the user."

We administered the survey instrument to 25 participants in a pilot study through Amazon's Mechanical Turk (www.mturk.com). Our results show that the participants appear to possess a cognitive model of recommender systems. For example, while the participants did not use terms such as "user representation" or "collaborative filtering", they often describe a system that "keeps track of the user's purchases" and "recommends items similar to those she had purchased before."

The results of this pilot study will play an important role in improving the survey instrument and evaluating the design, feasibility, cost and time of conducting a larger, more thorough study with the goal of using grounded theory to describe the user's cognitive model of recommender systems. Future work will address the limitations revealed by this study and will explore how users modify their behavior based on their understanding of recommender systems, what impact such behavior might have on these systems, and how recommender systems might be designed to cope with, or even leverage, these behaviors.

The rest of this paper is organized as follows. In Section 2 we present our related work: a brief summary of common recommendation algorithms is offered, and established techniques for conducting surveys are reviewed. In Section 3 our methodology is described in detail. Section 4 presents the survey instrument. The results of the survey are described in Section 5; here we discuss recurring patterns observed in the survey responses and deliberate on important limitations and directions for future work before turning our attention to the concepts expressed by the survey participants. Finally, in Section 6 we offer our conclusions.

2 RELATED WORK

Traditional approaches for designing a recommender system include collaborative filtering, content-based filtering, and model-based methods. Collaborative filtering is often divided into two separate approaches. User-based collaborative filtering [16] identifies users similar to the target user and recommends items that those neighbors have consumed. Item-based collaborative filtering generates recommendations by finding items similar to those previously selected by the user [32]. In general, these recommenders are based on the idea that users who agreed in the past are likely to agree in the future [30].

On the other hand, content-based filtering systems [6, 24] learn a profile for each user based on features of the items previously consumed, such as the genre of a book or an actor in a movie. Similarly, item profiles are created by characterizing the items based on their attributes and features. The system generates recommendations by comparing the description of new items to the user's preference profile. For example, it may recommend a new sci-fi movie to a user that had been identified as a fan of science fiction.

In contrast, model-based methods [2] train a model for each user based on prior user preferences. Several model-based approaches exist, but in general, the goal is to predict the likelihood of new items being of interest to the target user. Examples include Matrix Factorization [19] and Singular Value Decomposition [3].

We speculate that as recommender systems have become more commonplace and popular, users may have developed a basic understanding of how they work. We liken this understanding to the Theory of Mind [28], the cognitive capacity to perceive and predict other people's behavior in terms of their mental states. Frith et al. [13] explained that the behavior of people can be understood on the basis of their minds: their knowledge, beliefs, desires, and intentions. Moreover, people engaged in social life attribute various knowledge, beliefs, desires, and intentions to others [15]. Such attributions are useful in analyzing, judging, and inferring another person's behavior. This ability is a fundamental aspect of social cognition that guides an individual's behavior in a society.

In order to understand the user's perception of how recommender systems work, what we refer to as a Theory of the Recommender, we propose the development of a grounded theory. Developed by Glaser and Strauss, "grounded theory is a general methodology for developing a theory that is grounded in data systematically gathered and analyzed" [14]. This approach uses a constant comparative method where each observation is compared with others to find similarities and differences, generating concepts, hypotheses, and relationships that best explain behavior and processes [7]. Grounded theory methods allow elaboration and modification of existing theories or the generation of new theories from the collected data [10]. This method is iterative: the emerging theory is incrementally refined based on recurring data collection and analysis. This approach of generating theory from collected data is well suited to this work since our aim is to model the user's perceptions and beliefs about recommender systems.
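The user-based collaborative filtering idea summarized above can be made concrete in a few lines. The sketch below is purely illustrative: the ratings, user names, and scoring choices are invented for the example and are not code from the paper or from any system it describes.

```python
# Illustrative sketch of user-based collaborative filtering (toy data,
# not from the paper): users are sparse vectors of item ratings,
# neighbors are found by cosine similarity, and a neighbor's unseen
# items are recommended.
from math import sqrt

ratings = {  # hypothetical user -> {item: rating} data
    "alice": {"Matrix": 5, "Inception": 4, "Titanic": 1},
    "bob":   {"Matrix": 4, "Inception": 5, "Arrival": 4},
    "carol": {"Titanic": 5, "Notebook": 4},
}

def cosine(u, v):
    """Cosine similarity between two sparse rating vectors."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)

def recommend(target, k=1):
    """Suggest items rated by the k most similar users but unseen by target."""
    neighbors = sorted(
        (u for u in ratings if u != target),
        key=lambda u: cosine(ratings[target], ratings[u]),
        reverse=True,
    )
    seen, recs = set(ratings[target]), []
    for u in neighbors[:k]:
        recs += [i for i in ratings[u] if i not in seen and i not in recs]
    return recs

print(recommend("alice"))  # alice's closest neighbor is bob
```

Real systems refine this basic scheme with rating normalization, neighbor weighting, and far larger neighborhoods, but the core "users who agreed in the past are likely to agree in the future" intuition is the same.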
Previous work at the intersection of recommender systems and human-computer interaction has focused mainly on enhancing the quality of user experience and interaction with the recommender system. Pu and Chen [29] conducted a user study to show that the quality of user experience and interaction with the system is crucial for recommendation performance. They provided a framework named ResQue (Recommender systems' Quality of user experience) to measure a recommender system's overall perceptive qualities and its effectiveness in influencing users' behavioral intentions.

Kulesza et al. [20] showed how a user's mental model of the recommender system can be used to debug and personalize an intelligent agent. In a similar study, the authors explained that the soundness and completeness of explanations when presenting recommendations can impact the end user's mental model of the recommender system [21].

Bonhard and Sasse [8] used a grounded theory methodology to demonstrate the impact of similarity and familiarity visualizations. People prefer recommendations from people they know; presenting the familiarity and similarity between the user and the people who have rated the items aids them in their decision-making process.

Arazy et al. [5] argue for a system that is grounded in theories of human behavior. They proposed a methodology to apply theory-driven design to social recommender systems with the aim of improving prediction accuracy, suggesting the use of behavioral theories to guide information system design.

Our work relies on these previous efforts in several ways. First, we must understand the mechanics of recommender systems and how users might interpret them. Second, we must use a well-established tool, in this case qualitative study, to model the user's understanding of recommender systems. In this pilot study, we do not go as far as building a grounded theory, but present evidence that such a model is feasible and take the first steps toward developing it. Third, we take inspiration from the Theory of Mind and previous efforts in social psychology to explore the question of whether or not users possess a theory of the recommender.

3 METHODOLOGY

In this work, we seek to determine if today's internet users understand how recommender systems function. To that end, we present our methodology, including the design of our survey instrument, the interpretation of the results, and preparation for a larger study.

The initial survey was designed after conducting an extensive literature review to identify fundamental aspects of recommender systems [12]. The questions were based on common scenarios users often experience when interacting with recommender systems. In particular, domains for the survey questions were inspired by Adomavicius and Tuzhilin's survey of the state of the art and possible extensions of recommender systems [3] and by the recommender systems handbook [31] by Ricci et al. Adomavicius and Tuzhilin describe three popular types of recommender systems, namely collaborative filtering [18], content-based filtering [11], and model-based methods [2], and explain the new user problem [33]. Similarly, Ricci et al. delineate basic recommender system ideas and concepts such as user representation [11], item representation [11], and the goals of the recommender system [31]. Consequently, questions were designed to probe the participant's understanding of similar recommender system concepts, namely user models, item models, similarity measures, the cold start problem, collaborative filtering, and content-based models.

We relied on open-ended questions to allow participants to provide in-depth accounts of their experience and understanding of the system based on their interactions with it. To ensure the relevance of participants' answers, we provided them with information about the context of the study (i.e., recommendation systems), while being cautious not to bias their responses.

The survey instrument was evaluated by a panel of three domain experts in recommender systems to establish content validity [9, 23]. Similarly, discussions with students helped establish the face validity of the survey [23]. In this instance, content validity describes the extent to which a survey instrument fairly represents the domain the instrument seeks to measure, in this case, the domain of online recommender systems. Face validity, on the other hand, assesses whether the survey is comprehensible to the participants or other technically untrained observers.

After receiving feedback, the survey instrument was revised. The feedback resulted in the removal of some questions, the addition of new questions, and the rewording of others. The instrument was then returned to the panel for more feedback. This iteration continued until a consensus was achieved.

After the survey was approved, it was administered to a pool of participants. Participants were first informed of the purpose of the survey, any risks, the expected time commitment, and the payment information. They were then asked if they wished to participate. Those that wished to continue were asked basic demographic information such as age and gender. At this stage, the system asked about their familiarity with recommender systems to ensure they had sufficient exposure to complete the survey. The participants were then presented with the open-ended questions based on common recommender system scenarios.

Once the surveys were completed, we evaluated the responses. Incomplete responses were discarded. The remaining responses were read by three domain experts with a deep understanding of recommender systems. Answers to the open-ended questions were coded and organized based on their association with known recommender system concepts such as similarity functions and user modeling. Disagreements among the coders were resolved by consensus decisions. Quotes from these responses were then organized for presentation.

In this pilot study, we concluded our analysis at this stage and sought to ascertain whether there was sufficient evidence to warrant the time and cost of a larger study. In a larger online study, we would continue surveying participants until we observed a saturation of concepts. Coders could then develop a grounded theory representing the user's theory of the recommender.
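As a loose illustration of the coding step, the sketch below tags free-text responses with concept labels by keyword matching and tallies them. The keyword lists and responses are invented for the example; the actual coding in the study was performed by three human experts, not by software.

```python
# Hypothetical sketch of tallying coded survey responses. The real
# coding was manual; the keyword lists here are assumptions, not the
# authors' codebook.
from collections import Counter

CONCEPT_KEYWORDS = {
    "user model": ["history", "purchases", "profile"],
    "collaborative filtering": ["other users", "people who bought"],
    "item model": ["genre", "actor", "author"],
}

def code_response(text):
    """Return the set of concepts whose keywords appear in the response."""
    text = text.lower()
    return {c for c, kws in CONCEPT_KEYWORDS.items()
            if any(k in text for k in kws)}

responses = [  # invented example responses
    "It keeps track of the user's purchases and browsing history.",
    "It recommends what other users with similar taste liked.",
    "The system noted the genre and author of the books she read.",
]

tally = Counter(c for r in responses for c in code_response(r))
print(tally)
```

In a larger study, a tally like this could help judge when concept saturation has been reached, though the concept assignments themselves would still come from human coders.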
4 SURVEY

Here we present the survey instrument. First, we presented the user with key information. Second, we asked the subject for basic demographic information such as gender, age, and profession. We then presented the subject with five scenarios. These scenarios are meant to capture everyday situations an internet user might encounter, such as signing up for a new service, visiting a familiar online retailer, or marking a recommended item as irrelevant.

4.1 Key Information

In order to conform to the standards and practices described by the Institutional Review Board, we presented the participants with several pieces of key information. We provided our contact information. We then described the purpose of this research, before discussing the risk, effort, and remuneration associated with completing the survey. Privacy issues were discussed and we explicitly stated that no personal information that could be used to identify participants would be collected. Participants were then asked if they wished to continue before being administered the survey.

4.2 Demographic Information

The users were asked to provide demographic information such as age, gender, and education level, and to list services they have used in the past, to gauge whether a user was qualified to take the survey. The questions were:

(1) Please provide your gender.
(2) Please select your age group.
(3) Please tell us your profession.
(4) Please provide your highest education level.
(5) List up to three applications, websites or services with a recommender system that you used in the past 1-6 months.
(6) List up to three recommendations that you received from the above stated applications, websites or services.
4.3 Recommender System Scenarios

The open-ended questions first described a common scenario a user might encounter when interacting with a recommender system and then asked the user to explain some aspect of the scenario. We selected the questions after a literature review identifying common scenarios users often experience when interacting with recommender systems. These themes were preference elicitation (Q1), goals (Q2), user modeling (Q3a), familiarity with recommenders (Q3b), content-based filtering (Q3c), demographic information of a new user (Q4), implicit behavior (Q5a), and historic profiling (Q5b). The questions were:

(1) Robin likes watching movies, TV shows, and occasionally documentaries. His friend recommended him a subscription-based video streaming website. Robin registers for the service and the website asks Robin to select, from a list of movies, TV shows, and documentaries, a few that he likes. Explain with some detail (400 characters) why you think the system asked Robin to select those few items.

(2) Recommender systems provide personalized recommendations to users on a wide variety of platforms such as movies, music, travel, news, and products. Based on your experience interacting with the system on these platforms, list and explain with some detail (400 characters) four goals and intentions of the system.

(3) Sarah is a regular customer at a popular retail website. The website sells many types of items but Sarah usually buys books, electronics and occasionally clothing. She rates her purchases and leaves feedback on items she bought. Answer the following questions.
(a) Whenever Sarah logs on to the website, she finds a section on the web page with a list of recommended items. Explain with some detail (400 characters) what information the recommender system uses to make those recommendations.
(b) List a few items the recommender system might recommend Sarah in this section.
(c) Visiting the site today, Sarah is recommended a book by an author she is familiar with. Explain with some detail (400 characters) how you think the system made that recommendation.

(4) Joe likes listening to music and connects to a music streaming service as a new user by answering several demographic questions including age and gender along with his preference for music (e.g. preference for rock, jazz, and blues). Upon completing the registration, the site recommends several tracks for Joe to play. Explain with some detail (400 characters) how the recommender system made those recommendations for Joe.

(5) Emily is a user of a popular video streaming website that allows users to create profiles (channels) and upload videos on various topics including sports, music, news, and entertainment. Similarly, users who register on the site can subscribe to these channels, search, watch, like, comment, and share other videos.
(a) Sometimes, Emily dislikes her recommended videos and finds them irrelevant. Explain with some detail (400 characters) why the recommender system might have recommended those items to Emily.
(b) What difference in the nature of recommendations would Emily have noticed if she used the website as a registered user as compared to using the site as a guest? Explain with some detail (400 characters).

5 SURVEY RESULTS

The survey was administered to 25 participants on Amazon Mechanical Turk [26] using the Qualtrics XM Platform for surveys [27]. Mechanical Turk is a crowdsourcing service that connects 'workers' and 'requesters'. Requesters publish 'human intelligence tasks', or HITs, and workers complete these tasks online, usually for a small sum. For this HIT, workers were selected based on four criteria. First, their location was limited to the United States in order to avoid language concerns. Second, we limited the participants to those that had completed at least 50 HITs. Third, we limited the participants to those that had achieved a 90% acceptance rate on their previous HITs. Fourth, we limited the participants to those that had been awarded 'Master' status, workers identified by Amazon as maintaining a high standard across a wide range of tasks. These last three criteria were enforced to ensure the quality of the responses. We paid each participant $2.00.

14 of the participants identified as male, 10 as female, and 1 preferred not to answer. The ages of the participants were relatively uniformly distributed from 18 to 60+, with slightly fewer participants in the 60+ range. The participants came from a wide range of professions including a baker, teacher, editor, engineer and business analyst. Nearly all of the participants indicated that they held a bachelor's degree or higher. When asked what websites or services with a recommender system they had recently used, common answers included Amazon, eBay, YouTube, Quora, Facebook and Netflix.

In general, analysis of the survey shows that the users possess a relatively sophisticated understanding of how recommender systems operate. In the remainder of this section, we discuss the recurring concepts expressed in the survey responses. We then present several important limitations of this initial study and how we might overcome these limitations in future work. We conclude this section with a discussion of the extent to which modern internet users possess a theory of the recommender.
5.1 Identification of Concepts

Recommender systems are a multifaceted application. In the simplest terms, recommenders attempt to improve the user experience by steering users toward items they would prefer and away from items they would not. However, there are many strategies employed to accomplish this goal. Moreover, each strategy might have many sub-strategies. There may also be several external goals for which the recommender is tuned. Here we attempt to dissect recommender systems into their fundamental parts and relate them to the recurring concepts identified in the analysis of the responses to the survey instrument.

5.1.1 User Representation. A fundamental concern when designing a recommender system is how to represent the users. In some systems users are represented as a vector space representation over the item space; that is to say, they are represented by the collection of items they have consumed, purchased or rated in the past. In our survey, we found several examples of participants describing this type of user representation.

One user described "user information", explaining that the system uses ratings and reviews provided by the user. Another user stated that the application uses "history" to build a list based on the items browsed or consumed by the user in the past. Perhaps the best example was:

"The information that is used to make recommendations are primarily things that the users have input into the site themselves. Their page views, prior purchases, reviews, likes and dislikes are all taken into account. The system will mainly show the user things that it thinks they will like or want to get more information about - usually similar items to what they have bought in the past." - Male, 41-50, Economist

In other systems users are represented by their demographic information such as sex, age, or geographic location. Such an approach works based on the notion that users with similar demographics have similar preferences. The system then uses this information to classify users into pre-existing groups. Similar to the above concept, participants recognized that the system uses demographic data to recommend items. For example, one user described, "The system looks at user's age and gender and then compares it to what other users in those demographics tend to prefer". Another user answered:

"Based on ... [the] user's identified demographics, the system can build an interest profile. The demographics will help to narrow the scope, as a particular decade may be more likely to be of interest to the user based on their age." - Male, 31-40, Business Analyst

5.1.2 Item Representation. Like user representation, item representation is a critical aspect of many recommender systems. For instance, in a movie application, a movie might be represented by a set of features including the genre, actors, and directors. The survey participants appear to understand the importance of item representation, citing specific attributes such as "genre", "actors" and "authors". Users describe two items as being similar if the same users consumed them. One user described the importance of item representation:

"The system looks at user's reading habits and trends. It sees that ... the user has read books by this author more frequently in the past. Therefore, it makes sense that the system will recommend a book by this author. If ... the user tends to read a lot of fiction, and this author has written fiction as well as non-fiction, the system probably recommended a work of fiction." - Male, 61+, Retired

5.1.3 Collaborative Filtering. Collaborative filtering models work on the assumption that users who had comparable preferences in the past are likely to have comparable preferences in the future. Based on this idea, collaborative filtering identifies 'similar' users to a given target user based on the similarity of ratings or items consumed. The system then recommends items rated by those similar users [16].

The survey responses demonstrated users having similar perceptions of the topic. One user described "... Recommendations are based on what the user ... has purchased before, how he rated it, what he browses and what other people who ... bought the same things as ... the user tend to like." Similarly, another said:

"The system will use customer's preferences ... to match with those of other users of that streaming service. If other users have the same or similar preferences ..., there's a good chance that the user will like the same content as those other users. This allows the streaming service to provide relevant recommendations. If the system has enough data from a large number of users, these predictions can be fairly accurate." - Male, 21-30

5.1.4 Content-Based Filtering. Content-based recommender systems build a profile for each user based on past preferences and interactions. In general, the system makes recommendations by comparing features of items to the user profile. We noticed participants describing the same idea in several different ways.

One user mentioned "similar content", stating that the system recommends items similar to those recently purchased or clicked. Another user inferred that the service likely uses features of items bought in the past, explaining "... if a user is looking at books, the system will note the genre and recommend based on that." One of the best examples was:

"I think the system is looking for a pattern in ... [the] user's selection of movies. For example, if the user selects some movies with a particular actor, the system is likely to recommend other movies to the user with that actor. For example, let's say a person listed a movie with Jack Nicholson in it and it was a thriller and of his earlier movies. This alerts the system that the user probably is more interested in Nicholson's earlier movies than more recent movies, and may very well not be a fan of his comedies." - Male, 61+, Retired
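The content-based idea in Section 5.1.4, including the Nicholson pattern the participant described, can be sketched with item feature sets and a simple overlap score. The catalog, features, and Jaccard ranking below are illustrative assumptions, not the mechanics of any system the participants used.

```python
# Illustrative content-based filtering (toy data): the user profile is
# the union of features of watched items; candidates are ranked by
# Jaccard overlap with that profile.
items = {  # hypothetical catalog with hand-assigned features
    "Chinatown":        {"nicholson", "thriller", "1970s"},
    "Five Easy Pieces": {"nicholson", "drama", "1970s"},
    "Anger Management": {"nicholson", "comedy", "2000s"},
    "Alien":            {"horror", "scifi", "1970s"},
}

watched = ["Chinatown"]
profile = set().union(*(items[t] for t in watched))

def jaccard(a, b):
    return len(a & b) / len(a | b)

candidates = [t for t in items if t not in watched]
ranked = sorted(candidates, key=lambda t: jaccard(profile, items[t]), reverse=True)
print(ranked[0])  # another early Nicholson film ranks first
```

With richer features and learned weights this becomes the user-profile-versus-item-description matching described in Section 2, but the matching intuition the participants voiced is already visible here.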
Users describe two item being similar if the mender systems are model-based approaches such as those based IntRS Workshop, September, 2019, Copenhagen, Dk Muheeb Faizan Ghori, Arman Dehpanah, Jonathan Gemmell, Hamed Qahri-Saremi, and Bamshad Mobasher on singular value decomposition. These techniques are often more Recommender systems provide recommendations with complex than collaborative filtering or content-based approaches. a goal to satisfy the customer. It helps make deci- As such, we did not find that the users we able to describe model- sions for the customer. I think another goal is for the based approaches in as much detail as the simpler ones. There were, company to make money. Buying recommendations however, a few examples. equals profit. Another goal would be to increase user One user stated, “The goal is to build as robust of a profile as loyalty. And last, the goal of simply bettering the sys- possible for users so that it may inspire further purchases.” Another tem overall. user expressed, “... The system is ... gathering information about me - Male, 21-30, Sales Representative in order to create a profile and refine its recommendations that can 5.1.8 Attitude. Attitude is a user’s overall feeling towards a rec- be used to determine the types of ads and marketing that I would ommender system. Users possess different impressions about the be attracted to, and perhaps respond to.” system based on their interactions. Examples of positive attitudes The algorithm can see what ... users like and deter- include satisfaction, confidence, and trust. Similarly, a few users mine, from there, what else ... they might be interested find the system invading their privacy or too onerous. in. I do not know the specifics of how they work, When asked about this aspect, one participant answered, “The though, so I can’t really be that detailed. 
5.1.6 Association Rule Mining. Association rule mining is another useful machine learning technique that helps gain insights into the user's buying habits. Association rules capture relationships among items based on patterns of co-occurrence across transactions [4]. Amazon, for example, uses association rules when it presents a list of items to the user and states, "Customers who viewed this item also viewed these items."
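The co-occurrence counting behind such rules can be sketched in a few lines. The baskets and item names below are hypothetical; real systems mine rules over far larger transaction logs, with support and confidence thresholds:

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions (shopping baskets).
transactions = [
    {"camera", "sd_card", "tripod"},
    {"camera", "sd_card"},
    {"laptop", "mouse"},
    {"camera", "tripod"},
]

# Count how often each item, and each pair of items, occurs.
item_counts = Counter(item for basket in transactions for item in basket)
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Confidence of the rule camera -> sd_card, i.e. P(sd_card | camera):
confidence = pair_counts[("camera", "sd_card")] / item_counts["camera"]
print(round(confidence, 2))  # 2 of the 3 camera baskets also contain an sd_card
```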
While most users described the idea of recommended items being similar, one user hinted towards the idea of associated products. The user expressed that she is usually recommended gadgets that work with the electronics she previously purchased. Another user described,

    The system may recommend books in similar genres ... the user reads. [The system may recommend] electronics that are accessories or work with [items] ... the user has already bought.
    - Female, 21-30, Graphic designer

5.1.7 Goals. The fundamental goal of a recommender system is to provide users with relevant information. While identifying relevant information is integral to these systems, recommenders also possess several other motivations that may differ from the viewpoint of consumers and providers. Goals of the recommender system from a consumer's viewpoint include helping users explore the product space, actively notifying users of relevant content, and providing a satisfying experience. At the same time, from the provider's viewpoint, the recommender system's goals may be to steer user behavior in the desired direction, increase revenue, learn consumer habits, and maintain customer loyalty [17].

Participants in our survey reflected a similar understanding of the goals and motivations of the recommender system. One participant expressed, "The goal of the system is to ensure the user has a positive and useful experience with the service." Another user wrote,

    Recommender systems provide recommendations with a goal to satisfy the customer. It helps make decisions for the customer. I think another goal is for the company to make money. Buying recommendations equals profit. Another goal would be to increase user loyalty. And last, the goal of simply bettering the system overall.
    - Male, 21-30, Sales Representative

5.1.8 Attitude. Attitude is a user's overall feeling towards a recommender system. Users possess different impressions about the system based on their interactions. Examples of positive attitudes include satisfaction, confidence, and trust. Similarly, a few users find the system invading their privacy or too onerous.

When asked about this aspect, one participant answered, "The system ensures the user has a positive and useful experience with the service." On the other hand, another participant revealed,

    The ... user might find that she has generally better recommendations but is losing more of her privacy as a result. They ... would likely find their preferences appear in the form of ads not just on this particular site, but other sites related to or sites that are part of the same conglomerate.
    - Male, 21-30, Civil Engineer

5.1.9 Cold Start Problem. A crucial challenge for any recommender system is how to recommend items to new users. For a new user, the system lacks the valuable history of the user's interactions with the system on which to base the recommendations [33]. Similarly, new items with few ratings become difficult to recommend.

Participants who took our survey conveyed similar interpretations of the concept. One user discussed "initial recommendations", recognizing that they might differ from those the user would receive later on. Another presumed that the system could overcome a lack of knowledge about the user by using an "enrollment survey" to get a basic idea of their preferences. A third mentioned the use of "user demographics" to recommend items preferred by those within the same age group or location. A user expressed:

    The systems use ... past purchase history to make recommendations, since it's a clear indication of what the user likes to purchase...The suggestions gain strength in accuracy the more a user searches on the site and importantly the suggestions will be very weak or non-existent with a new user. However, if the systems use their demographics (as provided when she signed up on the site), the system can provide recommendations based on her age, sex, and location.
    - Female, 41-50, Transcriptionist
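The demographic fallback these participants describe can be sketched as follows (all data and field names are hypothetical): when a user has no interaction history, recommend what is popular among users in the same demographic group.

```python
from collections import Counter

# Hypothetical interaction histories and sign-up demographics.
history = {"alice": ["book_a", "book_b"], "bob": ["book_a"], "carol": []}
age_group = {"alice": "21-30", "bob": "21-30", "carol": "21-30"}

def cold_start_recommend(user, n=1):
    # Peers are other users who reported the same demographic group.
    peers = [u for u in history if u != user and age_group[u] == age_group[user]]
    popular = Counter(item for u in peers for item in history[u])
    # Exclude anything the user has already seen (empty for a new user).
    return [item for item, _ in popular.most_common() if item not in history[user]][:n]

# carol has no history, so she gets her age group's most popular item.
print(cold_start_recommend("carol"))  # ['book_a']
```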
5.1.10 Diversity. Diversity in a recommendation list is the variation in the items being presented to the users [34]. Often there is inherent uncertainty in a user's interests; therefore, recommending a variety of items may preserve user interest and avoid disappointment.

We witnessed only a few participants describe the notion of diversity, and with imprecise details, in the survey. One user mentioned,

    The system looks at the item that the user searches and comes up with similar items. The system also looks at the things that she has bought in the past and decide to put other items that are close to those on a recommended list in order to give her more of a choice to choose from.
    - Male, 21-30, Engineer
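The diversity of a single recommendation list is often quantified as the average pairwise dissimilarity between its items; a toy sketch using Jaccard distance over hypothetical genre sets:

```python
from itertools import combinations

# Hypothetical recommendation list with genre metadata per item.
items = {
    "Movie A": {"thriller", "crime"},
    "Movie B": {"thriller", "drama"},
    "Movie C": {"comedy", "romance"},
}

def dissimilarity(a, b):
    # 1 - Jaccard similarity of the two genre sets.
    return 1 - len(a & b) / len(a | b)

pairs = list(combinations(items.values(), 2))
ild = sum(dissimilarity(a, b) for a, b in pairs) / len(pairs)
print(round(ild, 2))  # higher means a more varied list
```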
5.1.11 Serendipity. Serendipity is another characteristic of items in a recommendation list that has been shown to improve the user's overall impression of the service. The important aspects of serendipity are for an item to be relevant, novel, and unexpected [25].

Though imprecise in their wording, survey participants expressed the ideas of "new, interesting and relevant" frequently throughout the survey. One user stated, "The application attempts to provide value, entertainment and interesting content". Another one said, "I believe that the system has ... some form of a baseline understanding of [the user's] interests so that it can make relevant recommendations." A third stated,

    The system is trying to expand ... user engagement with the system, by suggesting new topics for ... them to watch, they may reveal a new set of videos that she will watch on the platform.
    - Male, 31-40, Business Analyst

5.1.12 Context. Context is any information that can be used to describe the situation of an entity; information such as location, time, and season [1]. Users tend to have different preferences under different circumstances. For example, in a movie recommender, a user may prefer a different genre of movie based on his companion. Incorporating such information in the recommendation process helps personalize recommendations that are relevant to a user's specific context.
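Contextual pre-filtering, one common way to use such information, can be sketched in a few lines (the data below is hypothetical): interactions are restricted to the active context before popularity is computed.

```python
from collections import Counter

# Hypothetical (item, context) interaction log.
interactions = [
    ("sunscreen", "summer"), ("beach towel", "summer"),
    ("ski jacket", "winter"), ("sunscreen", "summer"),
]

def recommend(season, n=1):
    # Keep only interactions recorded in the matching context,
    # then recommend what is popular within it.
    in_context = Counter(item for item, s in interactions if s == season)
    return [item for item, _ in in_context.most_common(n)]

print(recommend("summer"))  # ['sunscreen']
print(recommend("winter"))  # ['ski jacket']
```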
A few participants demonstrated an understanding of the use of contextual information in recommendations. A user answered,

    If the system sees that I tend to make purchases more often in the summer than other [seasons], it may recommend more items related to vacations, summer clothing, etc.
    - Male, 60+, Retired
5.2 Limitations

This study is the first in a larger research agenda. Being the first foray into the study of how users understand recommender systems, it does suffer from several limitations. We discuss some of those limitations here and plan to overcome them in future work.

This survey has two noticeable biases. First, subjects were recruited from Amazon's Mechanical Turk. These individuals are likely more technologically savvy than the average internet user. Second, subjects self-selected to complete the survey. The Mechanical Turk volunteers opted to complete this survey after viewing the title, "Tell us Your Experience with Recommender Systems." Workers without an interest in recommender systems may have opted not to complete the survey, thereby skewing the sample population. In retrospect, the strict criteria we placed on the workers' qualifications – number of completed HITs, HIT acceptance rate, etc. – may have exacerbated these biases.

Another limitation of the study is the number of participants. Only 25 Mechanical Turk workers completed the survey. While this was enough to provide subjective evidence that many users maintain a mental model of the recommender system, we cannot claim that we have collected enough responses to have identified all the ways in which users understand recommender systems.

A third limitation stems from the nature of online surveys. They are inflexible. While we were able to ask nearly any question we would like, we could not improve questions based on previous responses.

A fourth limitation, also stemming from the nature of online surveys, is the lack of depth. The survey questions are identical for each participant regardless of their experience, background or education. We were unable to adapt questions or explore answers based on the interaction with the participants as we might have in an in-person interview.

Despite these limitations, this pilot study provides strong subjective evidence that users possess a cognitive model of how recommender systems work. Next, we discuss future work to further explore this research direction while addressing the limitations.

5.3 Future Work

This pilot study was conducted to assess the feasibility, time and cost of a larger, more in-depth survey and to improve upon the study design. Here we discuss our plans for future work both in the short term and the long term.

In the short term, we plan to conduct a larger online survey. This survey will take an iterative process. Participants will be selected to complete the survey. Their responses will be coded by multiple coders, identifying key recurring concepts expressed by the participants. Another batch of participants will then be selected to complete the survey. This process will continue until we reach saturation of key concepts. Agreement between the coders will be evaluated to compute the inter-coder reliability [22].

We seek to develop a theory of the user's knowledge and perceptions rather than test any preconceived hypothesis. To this end, we will use a grounded theory methodology to develop a rich and detailed theoretical account of the user's understanding that is purely grounded in observations of their knowledge and personal experience. Consistent with the grounded theory approach, data collection and analysis will be conducted simultaneously, allowing emerging concepts to guide the process of further data collection. Several coding schemes will be used to identify emerging concepts and relationships, and unify them to formulate our 'theory of the recommender'. Finally, we will establish the trustworthiness of our findings by using an inter-rater reliability test.

To obtain a better outlook on the user's perceptual world of the recommender system, we will also use in-person interviews as our method of data collection alongside the online survey. This will allow us to ask more detailed questions and delve deeper into unique ideas that arise during the interview process.

In the long term, we are interested in how users change their behavior based upon their cognitive model of how recommender systems function. Users, sensitive to privacy concerns, may forgo the benefits of a recommender system and opt to view news stories in 'incognito' mode. Others, when presented with a political viewpoint with which they disagree, may aggressively downvote a content creator in a video sharing service in order to signal to the system that they are uninterested in those viewpoints. On the other hand, a user unhappy with the current set of recommendations may purposely search for and upvote items they have previously enjoyed in order to improve their user profile.

Anecdotal evidence suggests that, in recent years, these behaviors have become more commonplace. We must then ask: what is the impact of these behaviors on the health of the recommender system? We plan to perform experimental work on historical and synthetic data to understand the impact of these behaviors in a variety of domains.

Finally, we imagine a framework that can automatically identify and exploit the signals arising from these behaviors in order to improve the user experience. If a user assigns one star to a recommended movie, does it mean that the user truly dislikes it or simply that he does not want it taking up space in his recommendation queue? If a user uncharacteristically spends an online session rating several new songs, is this an indication that she is looking for more variety in her recommendations? In sum, the motivation of this research agenda is to understand the user's understanding of how recommender systems function, observe what behaviors that understanding manifests, and engineer recommender systems to take advantage of these behaviors to improve the system's performance.
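To make concrete how rating behavior feeds back into a recommender, consider a minimal user-based collaborative filtering sketch (toy ratings; cosine similarity over raw vectors, with unrated cells as zeros, is a simplification): the ratings a user submits determine which neighbors count as similar, so strategic upvotes or downvotes directly shift the predictions that come back.

```python
import numpy as np

# Toy user-by-item ratings; 0.0 marks an unrated cell (a simplification).
ratings = np.array([
    [5.0, 4.0, 1.0, 0.0],   # target user; item 3 is unrated
    [5.0, 5.0, 1.0, 1.0],   # like-minded neighbor
    [1.0, 1.0, 5.0, 5.0],   # user with opposite tastes
])

def predict(target, item):
    sims, vals = [], []
    for other in range(len(ratings)):
        if other == target or ratings[other, item] == 0:
            continue
        u, v = ratings[target], ratings[other]
        sims.append(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
        vals.append(ratings[other, item])
    # Similarity-weighted average of the neighbors' ratings for the item.
    return np.dot(sims, vals) / np.sum(sims)

# The like-minded neighbor dominates, pulling the estimate toward 1.
print(round(predict(0, 3), 2))
```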
5.4 Discussion

The participants of the online survey have developed a cognitive model of how recommender systems function, going far beyond a primitive understanding of inputs and outputs. They have developed a theory of the recommender. Just as the theory of mind describes an individual's ability to attribute goals, perceptions, knowledge, and beliefs to others, the theory of the recommender describes the ability of users to attribute these qualities to the recommender system. We have purposely adopted the theory of mind as an exemplar because in many ways the interaction a user has with a recommender system mimics more closely their interaction with other human beings than with other online applications.

It is common for a system to query the user's interests during a registration process. It may then suggest items to the user. The user may then consume the item, rate it, write a review of it, or even ignore it. Recommender systems exploit this feedback to make new recommendations which the user can then view. This cyclical interaction can be likened to a conversation with the recommender system. Word processors, online shopping carts, and wikis do not share this form of interaction.

Such an interaction directly impacts the relationship the user has with the recommender. The user can witness, sometimes immediately, the result of liking or disliking an item. The user can predict what would happen if they read news stories about a particular city or event. The user can reason why a recommender system is promoting a new song. The rich and ubiquitous interactions users have with recommender systems enable the users to refine over time a cognitive model of how they function.

Often the user is able to interpret the goals of the system. We observed in the survey responses several examples of perceived goals, including 1) satisfying the user's interests, 2) aiding the decision-making process, 3) increasing loyalty, and 4) maximizing profit.

The user is also able to infer the perceptions of a recommender. Users clearly understand that their ratings, demographics, and click-throughs are observed by the system. Some users even understand that they can manipulate the output of the recommender by changing the information they allow the recommender to perceive.

The user may infer what knowledge the recommender captures. This knowledge can include how the system represents users or items. It can include metadata about the items. It may even capture the context of the user as the recommendations are being made.

Finally, the user can make assumptions about what the recommender believes. A fundamental belief of collaborative filtering is that users who have agreed in the past are likely to agree in the future. Content-based recommenders, on the other hand, believe that as a user consumes a product, she is also labeling herself with the characteristics of the item. Recommenders relying on association rule mining believe that consumption patterns observed in the past are relevant for the present. Users, experienced with recommender systems, have incorporated these beliefs into their mental model of recommender systems. All these beliefs were identified by the survey participants.

Implications of a user's theory of the recommender are many. Even though this work presents compelling evidence that users possess such a model, this pilot study is only the first step in formalizing it. How user behavior is informed by their mental model of the recommender, and how recommender systems can adapt to these behaviors, remains an important research direction.

6 CONCLUSION

In this paper, we asked the question: Does the user have a theory of the recommender? To answer this question, we developed a survey instrument to elicit a subject's understanding of recommender systems based on several scenarios. This survey was given to 25 participants. An exhaustive analysis of their responses demonstrates that the participants possess a keen understanding of many of the recommenders' basic algorithms and design goals. For example, many users seem to understand that recommenders often keep track of past behavior, identify similar users, leverage metadata, and seek to provide relevant and diverse recommendations.

This paper is the first step of a larger research agenda. Future research milestones include conducting a larger online survey until we reach a saturation of key concepts, constructing a grounded theory from these key concepts, and conducting in-person interviews to verify and improve upon the grounded theory. Later, we plan to evaluate how users modify their behavior based on their cognitive model of the system, what impact this behavior might have on the recommender, and how a recommender system can identify and leverage that behavior.

ACKNOWLEDGMENTS

We would like to thank the participants who completed the survey. We would also like to thank members of the Institutional Review Board who assisted in the design and approval of the survey instrument administered to human subjects. This work was supported in
part by two PhD Scholarship Awards granted by DePaul University.

REFERENCES

[1] Gregory D Abowd, Anind K Dey, Peter J Brown, Nigel Davies, Mark Smith, and Pete Steggles. 1999. Towards a better understanding of context and context-awareness. In International Symposium on Handheld and Ubiquitous Computing. Springer, 304–307.
[2] P. H. Aditya, I. Budi, and Q. Munajat. 2016. A comparative analysis of memory-based and model-based collaborative filtering on the implementation of recommender system for E-commerce in Indonesia: A case study PT X. In 2016 International Conference on Advanced Computer Science and Information Systems (ICACSIS). 303–308. https://doi.org/10.1109/ICACSIS.2016.7872755
[3] Gediminas Adomavicius and Alexander Tuzhilin. 2005. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge & Data Engineering 6 (2005), 734–749.
[4] Rakesh Agrawal, Tomasz Imieliński, and Arun Swami. 1993. Mining association rules between sets of items in large databases. In ACM SIGMOD Record, Vol. 22. ACM, 207–216.
[5] Ofer Arazy, Nanda Kumar, and Bracha Shapira. 2010. A theory-driven design framework for social recommender systems. Journal of the Association for Information Systems 11, 9 (2010), 455.
[6] Marko Balabanović and Yoav Shoham. 1997. Fab: content-based, collaborative recommendation. Commun. ACM 40, 3 (1997), 66–72.
[7] Hennie Boeije. 2002. A purposeful approach to the constant comparative method in the analysis of qualitative interviews. Quality and Quantity 36, 4 (2002), 391–409.
[8] P Bonhard and MA Sasse. 2006. 'Knowing me, knowing you' – using profiles and social networking to improve recommender systems. BT Technology Journal 24, 3 (2006), 84.
[9] M-C Boudreau, D Gefen, and DW Straub. 2001. Validation in information systems research. MIS Quarterly 25, 1 (2001), 1–14.
[10] Juliet M Corbin and Anselm Strauss. 1990. Grounded theory research: Procedures, canons, and evaluative criteria. Qualitative Sociology 13, 1 (1990), 3–21.
[11] Marco De Gemmis, Pasquale Lops, Cataldo Musto, Fedelucio Narducci, and Giovanni Semeraro. 2015. Semantics-aware content-based recommender systems. In Recommender Systems Handbook. Springer, 119–159.
[12] David De Vaus. 2013. Surveys in Social Research. Routledge.
[13] Chris Frith and Uta Frith. 2005. Theory of mind. Current Biology 15, 17 (2005), R644–R645.
[14] Barney G Glaser and Anselm L Strauss. 2017. Discovery of Grounded Theory: Strategies for Qualitative Research. Routledge.
[15] Alvin I Goldman et al. 2012. Theory of mind. The Oxford Handbook of Philosophy of Cognitive Science (2012), 402–424.
[16] Jonathan L Herlocker, Joseph A Konstan, Loren G Terveen, and John T Riedl. 2004. Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems (TOIS) 22, 1 (2004), 5–53.
[17] Dietmar Jannach and Gediminas Adomavicius. 2016. Recommendations with a purpose. In Proceedings of the 10th ACM Conference on Recommender Systems. ACM, 7–10.
[18] Yehuda Koren and Robert Bell. 2015. Advances in collaborative filtering. In Recommender Systems Handbook. Springer, 77–118.
[19] Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix factorization techniques for recommender systems. Computer 8 (2009), 30–37.
[20] Todd Kulesza, Simone Stumpf, Margaret Burnett, and Irwin Kwan. 2012. Tell me more?: the effects of mental model soundness on personalizing an intelligent agent. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 1–10.
[21] Todd Kulesza, Simone Stumpf, Margaret Burnett, Sherry Yang, Irwin Kwan, and Weng-Keen Wong. 2013. Too much, too little, or just right? Ways explanations impact end users' mental models. In 2013 IEEE Symposium on Visual Languages and Human Centric Computing. IEEE, 3–10.
[22] Karen S Kurasaki. 2000. Intercoder reliability for validating conclusions drawn from open-ended interview data. Field Methods 12, 3 (2000), 179–194.
[23] Mark S Litwin. 1995. How to Measure Survey Reliability and Validity. Vol. 7. Sage.
[24] Pasquale Lops, Marco De Gemmis, and Giovanni Semeraro. 2011. Content-based recommender systems: State of the art and trends. In Recommender Systems Handbook. Springer, 73–105.
[25] Sean M McNee, John Riedl, and Joseph A Konstan. 2006. Being accurate is not enough: how accuracy metrics have hurt recommender systems. In CHI'06 Extended Abstracts on Human Factors in Computing Systems. ACM, 1097–1101.
[26] Gabriele Paolacci, Jesse Chandler, and Panagiotis G Ipeirotis. 2010. Running experiments on Amazon Mechanical Turk. Judgment and Decision Making 5, 5 (2010), 411–419.
[27] Eyal Peer, Gabriele Paolacci, Jesse Chandler, and Pam Mueller. 2012. Selectively recruiting participants from Amazon Mechanical Turk using Qualtrics. Available at SSRN 2100631 (2012).
[28] David Premack and Guy Woodruff. 1978. Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences 1, 4 (1978), 515–526.
[29] Pearl Pu, Li Chen, and Rong Hu. 2011. A user-centric evaluation framework for recommender systems. In Proceedings of the Fifth ACM Conference on Recommender Systems. ACM, 157–164.
[30] Paul Resnick, Neophytos Iacovou, Mitesh Suchak, Peter Bergstrom, and John Riedl. 1994. GroupLens: an open architecture for collaborative filtering of netnews. In Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work. ACM, 175–186.
[31] Francesco Ricci, Lior Rokach, and Bracha Shapira. 2011. Introduction to recommender systems handbook. In Recommender Systems Handbook. Springer, 1–35.
[32] Badrul Munir Sarwar, George Karypis, Joseph A Konstan, John Riedl, et al. 2001. Item-based collaborative filtering recommendation algorithms. WWW 1 (2001), 285–295.
[33] Andrew I Schein, Alexandrin Popescul, Lyle H Ungar, and David M Pennock. 2002. Methods and metrics for cold-start recommendations. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 253–260.
[34] Cai-Nicolas Ziegler, Sean M McNee, Joseph A Konstan, and Georg Lausen. 2005. Improving recommendation lists through topic diversification. In Proceedings of the 14th International Conference on World Wide Web. ACM, 22–32.