Testing a Recommender System for
                    Self-Actualization

                    Daricia Wilkinson, Saadhika Sivakumar,
                     Pratitee Sinha, Bart P. Knijnenburg

                              Clemson University
                                Clemson, USA
      dariciw@clemson.edu, ssivaku@g.clemson.edu, psinha@g.clemson.edu,
                              bartk@clemson.edu


      Abstract. Traditionally, recommender systems were built with the goal
      of aiding users’ decision-making process by extrapolating what they like
      and what they have done to predict what they want next. However, in
      attempting to personalize the suggestions to users’ preferences, these
      systems create an isolated universe of information for each user, which
      may limit their perspectives and promote complacency. In this paper,
      we describe our research plan to test a novel approach to recommender
      systems that goes beyond “good recommendations” to support user
      aspirations and exploration.

      Keywords: Recommender Systems; Filter Bubble; Choice Overload;
      Self-Actualization.


1   Introduction

Recommender systems have become ubiquitous in daily user interactions across
many e-commerce websites, social networking sites and streaming services. The
main purpose of these systems is to provide users with relevant information, and
as such, much of the content consumed online is personally tailored [17].
    Although personalized content has numerous benefits, presenting items that
are only based on some of users’ expressed preferences could hinder the effective-
ness of the recommender, and trap users in a “filter bubble” [14] that limits their
perspectives, discourages exploration and prevents genuine taste development.
    Recently, scholars have acknowledged the importance of other user-centered
factors beyond accuracy that contribute to the effectiveness of recommender
systems [6, 12]. This shift “beyond accuracy” has spawned investigations into
solutions that improve all aspects of the user interaction experience. For instance,
to increase understandability, researchers have suggested providing explanations
of the recommendations [4]. However, these explanations could increase the al-
ready high conformity, as users simply trust the system’s explanation rather
than engaging in true understanding and exploration [5].
    This could have long-term societal consequences, as the persuasive nature of
recommendations could replace human creativity and understanding [13], turning
humans into “input” for systems rather than acknowledging the opportunities
for taste development. This paper outlines our research plan to build upon
our recent proposal for recommender systems for self-actualization: systems that
help users understand their unique tastes through development and
exploration [9].


2    Algorithmic Features
Previous research on critiquing [2, 3, 15, 16] and diversifying recommendations [20]
has already investigated better options for the Top-N suggestions, yet the focus of
these alternative methods is still to provide “good recommendations”. Our
approach is fundamentally different, since it carefully considers the psychology
of consumer choice processes, and supports (rather than replaces) these processes
by featuring new recommendation lists. In addition to a Top-N list, our system
simultaneously displays alternative lists that promote exploration and taste
development (for more details see [9]). These alternative lists address the
following issues:

Incorrect negative predictions. In conventional recommenders, items that the
system predicts the user will dislike are never shown. While these predictions
are mostly correct, the system may be mistaken about some items. Such mistakes
are hard to correct, because items with low predicted ratings are never
recommended. Presenting users with a list of things we think you’ll hate allows
them to correct or confirm these low predictions.
     Our list of things we think you’ll hate contains items that have a low predicted
rating for this user, compared to the average predicted rating. To populate this
list, we compute the difference between the average predicted rating of the item
and the user’s predicted rating, and select the items for which it is largest:

       items = max(average predicted rating − user predicted rating)

    This allows users to correct mistakes quickly.
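
The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary-based inputs, and the toy ratings are hypothetical, and the predictions are assumed to come from whatever recommender algorithm is in use.

```python
def things_you_will_hate(user_pred, avg_pred, n=10):
    """Return the n items whose predicted rating for this user lies
    furthest below the item's average predicted rating.

    user_pred: item id -> predicted rating for the current user (assumed input)
    avg_pred:  item id -> average predicted rating over all users (assumed input)
    """
    # Gap = average predicted rating - user predicted rating.
    gaps = {item: avg_pred[item] - user_pred[item] for item in user_pred}
    # The largest gaps come first.
    return sorted(gaps, key=gaps.get, reverse=True)[:n]


# Toy data, invented for illustration only.
user_pred = {"A": 4.5, "B": 1.0, "C": 2.5, "D": 3.0}
avg_pred = {"A": 4.0, "B": 3.5, "C": 3.0, "D": 3.0}
hate_list = things_you_will_hate(user_pred, avg_pred, n=2)
print(hate_list)  # ['B', 'C']
```

Ranking by the gap rather than by the raw predicted rating favors items that most users are predicted to like but this user is not, which is where a wrong negative prediction is most costly.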

Unknown preferences. It is difficult for recommenders to predict items for which
there is insufficient information about whether the user will like them or not.
As a result, recommenders usually cater to users’ known preferences only. Rather
than only catering to known preferences, we propose to display a list of things
we have no clue about. This list consists of items with a user predicted rating for
which the system has the lowest confidence. Current recommender algorithms
do not provide confidence intervals, so for our study we estimate the system’s
confidence by computing the difference in user-predicted ratings for different
algorithms, e.g. matrix factorization (mf) and k-nearest neighbors (knn):

           items = max (predicted rating_{mf} − predicted rating_{knn})^2

   This allows the system to learn information about all of a user’s preferences,
rather than just a subset of their preferences.
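
As a rough sketch of this disagreement-based confidence estimate, the snippet below ranks items by the squared difference between two sets of predictions. The function name and the toy predictions are hypothetical; the two dictionaries stand in for the output of a matrix factorization model and a k-nearest-neighbors model, which are not implemented here.

```python
def things_we_are_not_sure_about(pred_mf, pred_knn, n=10):
    """Return the n items on which two algorithms disagree most.

    pred_mf, pred_knn: item id -> predicted rating for the current user,
    from matrix factorization and k-nearest neighbors (assumed inputs).
    """
    # The squared difference serves as a proxy for low confidence.
    disagreement = {
        item: (pred_mf[item] - pred_knn[item]) ** 2
        for item in pred_mf
        if item in pred_knn
    }
    return sorted(disagreement, key=disagreement.get, reverse=True)[:n]


# Toy predictions, invented for illustration only.
pred_mf = {"A": 4.0, "B": 2.0, "C": 3.5}
pred_knn = {"A": 4.1, "B": 4.5, "C": 2.5}
unsure_list = things_we_are_not_sure_about(pred_mf, pred_knn, n=2)
print(unsure_list)  # ['B', 'C']
```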

Novel items. New items are an enduring complication in recommender systems.
Since users have yet to try them, they rarely show up among the
recommendations. Most recommender systems address this cold start problem with
content-based techniques that approximate predicted ratings. However, this
solution ignores the fact that users may at times actually be excited to try
new things, even if these do not always fit their preferences [18]. We propose
to resolve the cold start problem by simply presenting items with limited rating
data to users who are excited to try them. These “hipster” users are likely to
appreciate things you’ll be among the first to try, and their feedback on these
items will help to populate the available information and hence improve the
system. We identify these “hipster” users by their high percentage of top-rated
items with very few ratings, and then show them more of such items.

        users = max(% top rated items with (#ratings < threshold))


                             items = min(#ratings)
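
One possible reading of these two formulas is sketched below. The interpretation of the percentage (the share of a user's top-rated items that have fewer than `threshold` ratings), the cut-off values, and all names are assumptions made for illustration; the text above does not fix them.

```python
def hipster_score(user_ratings, rating_counts, top_rating=5.0, threshold=20):
    """Share of the user's top-rated items that have fewer than
    `threshold` ratings overall (both cut-offs are assumed values)."""
    top_rated = [i for i, r in user_ratings.items() if r >= top_rating]
    if not top_rated:
        return 0.0
    rare = [i for i in top_rated if rating_counts[i] < threshold]
    return len(rare) / len(top_rated)


def first_to_try(rating_counts, already_rated, n=10):
    """Return the n items with the fewest ratings that the user has not rated."""
    candidates = [i for i in rating_counts if i not in already_rated]
    return sorted(candidates, key=rating_counts.get)[:n]


# Toy data, invented for illustration only.
rating_counts = {"A": 500, "B": 3, "C": 12, "D": 1, "E": 7, "F": 2, "G": 900}
user_ratings = {"A": 5.0, "B": 5.0, "C": 5.0, "D": 2.0}

score = hipster_score(user_ratings, rating_counts)
novel_list = first_to_try(rating_counts, already_rated=user_ratings, n=2)
print(score, novel_list)
```

Users would then be ranked by `hipster_score`, and those with the highest scores shown the `first_to_try` list.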

Controversial items. Recommenders usually identify a set of users that are simi-
lar to the current user, and then calculate recommendations based on the prefer-
ences of these nearest neighbors. This often leads to recommendations that the
neighbors unanimously like. These “safe” recommendations do not challenge a
user’s tastes beyond what is generally agreed upon as “good” among like-minded
users. However, these neighbors may not always be a homogeneous group, and there
may be certain polarizing items that some of them really like, but others hate.
Our fourth feature will detect these polarizing items. This list of things that are
controversial can help users to develop their unique tastes.
    Among the four proposed features, identifying polarizing items is arguably
the most challenging from an algorithmic perspective. The simplest approach to
identify polarizing items is to select items that have the highest rating variability
or range (rather than average) among the neighbors:

                items = max(var(neighbors’ predicted ratings))
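
The variance-based rule can be sketched as follows. The function name, the `min_raters` guard, and the toy ratings are hypothetical; the neighbors' ratings are assumed to have been gathered already by the neighborhood step of the recommender.

```python
from statistics import pvariance


def controversial_items(neighbor_ratings, n=10, min_raters=2):
    """Return the n items whose ratings vary most among the neighbors.

    neighbor_ratings: item id -> list of the neighbors' (predicted)
    ratings for that item (assumed input).
    """
    # Population variance of each item's ratings across the neighborhood.
    variances = {
        item: pvariance(ratings)
        for item, ratings in neighbor_ratings.items()
        if len(ratings) >= min_raters  # skip items with too few ratings
    }
    return sorted(variances, key=variances.get, reverse=True)[:n]


# Toy neighborhood, invented for illustration only.
neighbor_ratings = {
    "A": [5, 5, 4, 5],  # unanimously liked
    "B": [1, 5, 1, 5],  # polarizing
    "C": [3, 3, 3, 3],  # no disagreement
    "D": [2, 4, 3, 3],  # mild disagreement
}
controversial_list = controversial_items(neighbor_ratings, n=2)
print(controversial_list)  # ['B', 'D']
```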

    A more sophisticated approach would be to cluster the identified neighbors
based on their ratings, and then select items that best discriminate between
clusters.
    The proposed features could improve recommenders’ ability to support people
in life-altering decisions (e.g., choosing an education, a job, an insurance plan,
or a retirement fund) where it is important that they develop a strong sense of
determination about the chosen path. The features would also improve
recommenders’ ability to help people make lifestyle choices (e.g., about music,
movies, or fashion) based on carefully developed personal tastes. Our plan to
test and evaluate these features in a movie recommender system is outlined below.


Fig. 1. Mockup of the experiment showing the Top-10 list on the left and “Things you
may hate” condition on the right. The list on the right will differ for each of the five
conditions and it will be manipulated between-subjects.


3      Research Plan
The goal of our research is to develop, test and evaluate a recommender system
that supports rather than replaces the decision-making process for users. The
features proposed in the previous section, which we refer to as Recommender
Systems for Self-Actualization (RSSA) features, will be tested alongside a
traditional Top-N recommender. The system will be able to display a Top-N
recommendation list as well as the lists of the four new features. We will
train the system using the MovieLens dataset, and conduct an online experiment
to test the RSSA features.

3.1     Online Experiment
The experiment will be conducted on Amazon Mechanical Turk (MTurk) with at
least 300 participants. In addition to a traditional “Top-N only” baseline,
the experiment will test the four RSSA features, each in combination with a
Top-N list (see Fig. 1).
    In our study, participants will see two lists of 10 items: one list will be the
traditional Top-10 (“Things you might like”), while the other list will be manip-
ulated between-subjects with the following five conditions:

    – “More things you might like”; i.e., the next 10 recommendations (Top 11–20)
    – “Things we think you will hate”
    – “Things we are not sure about”
    – “Things you’ll be among the first to try”
    – “Things that are controversial”

    After being randomly assigned to one of the five experimental conditions,
participants will be asked to rate 15 movies that they have seen before, to use as
a basis for their recommendations. Next, we will show them the recommendations:
the Top-10 list of “things you might like” on the left, and on the right a list
of 10 items based on their assigned experimental condition. At this point, we
will ask participants to rate the movies from the two lists. After this final
round of rating we will update the two lists, and ask participants to select one
movie that they would watch right now. Finally, participants will be asked to
complete a questionnaire to evaluate their experience with using the system. The
questionnaire-based and behavioral aspects to be evaluated are outlined in
Table 1. We will adopt validated questionnaire items from previous studies
[1, 7, 8, 10, 11, 19] and develop additional scales along the lines of the
Knijnenburg et al. user experience framework for recommender systems [10],
which will contribute to the theory of recommender systems evaluation.


Aspect                              Description
(Q = questionnaire, B = behavior)

Perceived recommendation quality,   Existing scales [1, 7, 8, 10, 19]
diversity, novelty (Q)
System and choice satisfaction (Q)  Existing scales [1, 7, 8, 10, 11, 19]
Choice and tradeoff difficulty (Q)  Existing scales [1, 19]
Perceived taste coverage (Q)        Whether users think the system is able to cover
                                    all of their tastes
Objective coverage (B)              Average number of different items that are
                                    recommended to each user over the course of the
                                    experiment
Fear of missing things (Q)          Average number of different items that are
                                    recommended to each user over the course of the
                                    experiment
Taste clarification potential (Q)   Whether users think the system helps them
                                    understand their own tastes
Taste development potential (Q)     Whether users think the system helps them
                                    develop their own tastes
Perceived choice conformity (Q)     Whether users think they are consuming similar
                                    things to everyone else
Objective choice conformity (B)     Average cosine similarity between users’
                                    consumption patterns

           Table 1. User experience aspects measured in the experiment.
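
To illustrate the objective choice conformity measure in Table 1, the sketch below averages the cosine similarity over all pairs of users. The encoding of a consumption pattern as a binary vector over items is an assumption made here for illustration, as are the function names and the toy data.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_pairwise_cosine(patterns):
    """Mean cosine similarity over all pairs of users.

    patterns: user id -> vector over items, e.g. 1 if the user chose or
    consumed the item and 0 otherwise (assumed encoding).
    """
    pairs = list(combinations(patterns.values(), 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy consumption patterns, invented for illustration only.
patterns = {
    "u1": [1, 1, 0, 0],
    "u2": [1, 1, 0, 0],  # identical to u1
    "u3": [0, 0, 1, 1],  # no overlap with u1 or u2
}
conformity = average_pairwise_cosine(patterns)
print(round(conformity, 3))
```

Higher values indicate that users' consumption patterns resemble one another more closely, so lower values in a condition would suggest less conformity.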

4    Conclusion

In this paper, we describe a new direction for recommender systems that moves
towards supporting our aspirational selves rather than pushing content to users
based on their history. We outline our research plan for developing and evaluating
the new interface and algorithmic features. Aside from this, we are also working
with several companies and organizations to build these new features into real-life
recommenders. We believe that Recommender Systems for Self-Actualization
acknowledge the multidimensionality and evolving nature of human beings, and
can fundamentally change the way recommender systems are used.


Acknowledgments

This research was supported in part by the NSF award IIS 1565809.


References
 1. D. Bollen, B. P. Knijnenburg, M. C. Willemsen, and M. Graus. Understanding
    choice overload in recommender systems. In Proceedings of the Fourth ACM Con-
    ference on Recommender Systems, RecSys ’10, pages 63–70, New York, NY, USA,
    2010. ACM.
 2. L. Chen and P. Pu. Interaction design guidelines on critiquing-based recommender
    systems. User Modeling and User-Adapted Interaction, 19(3):167–206, Aug. 2009.
 3. L. Chen and P. Pu. Critiquing-based recommenders: Survey and emerging trends.
    User Modeling and User-Adapted Interaction, 22(1-2):125–150, Apr. 2012.
 4. G. Friedrich and M. Zanker. A taxonomy for generating explanations in recom-
    mender systems. AI Magazine, 32(3):90–98, 2011.
 5. F. Gedikli, D. Jannach, and M. Ge. How should I explain? A comparison of
    different explanation types for recommender systems. Int. J. Hum.-Comput. Stud.,
    72(4):367–382, Apr. 2014.
 6. C. He, D. Parra, and K. Verbert. Interactive recommender systems: A survey of
    the state of the art and future research challenges and opportunities. Expert Syst.
    Appl., 56:9–27, 2016.
 7. B. P. Knijnenburg, S. Bostandjiev, J. O’Donovan, and A. Kobsa. Inspectability
    and control in social recommenders. In Proceedings of the Sixth ACM Conference
    on Recommender Systems, RecSys ’12, pages 43–50, New York, NY, USA, 2012.
    ACM.
 8. B. P. Knijnenburg, N. J. Reijmer, and M. C. Willemsen. Each to his own: How
    different users call for different interaction methods in recommender systems. In
    Proceedings of the Fifth ACM Conference on Recommender Systems, RecSys ’11,
    pages 141–148, New York, NY, USA, 2011. ACM.
 9. B. P. Knijnenburg, S. Sivakumar, and D. Wilkinson. Recommender systems for
    self-actualization. In Proceedings of the 10th ACM Conference on Recommender
    Systems, RecSys ’16, pages 11–14, New York, NY, USA, 2016. ACM.
10. B. P. Knijnenburg, M. C. Willemsen, Z. Gantner, H. Soncu, and C. Newell. Explain-
    ing the user experience of recommender systems. User Modeling and User-Adapted
    Interaction, 22(4-5):441–504, Oct. 2012.
11. A. Kobsa, B. P. Knijnenburg, and B. Livshits. Let’s do it at my place instead?:
    Attitudinal and behavioral study of privacy in client-side personalization. In Pro-
    ceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI
    ’14, pages 81–90, New York, NY, USA, 2014. ACM.
12. J. A. Konstan and J. Riedl. Recommender systems: From algorithms to user
    experience. User Modeling and User-Adapted Interaction, 22(1-2):101–123, Apr.
    2012.
13. J. Lanier. You Are Not a Gadget: A Manifesto. Thorndike Press, 2010.
14. E. Pariser. The Filter Bubble: How the New Personalized Web Is Changing What
    We Read and How We Think. Penguin Books, New York, NY, USA, 2012.
15. P. Pu, B. Faltings, L. Chen, J. Zhang, and P. Viappiani. Usability guidelines
    for product recommenders based on example critiquing research. In F. Ricci,
    L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook,
    pages 511–545. Springer, 2011.
16. P. Resnick, R. K. Garrett, T. Kriplean, S. A. Munson, and N. J. Stroud. Bursting
    your (filter) bubble: Strategies for promoting diverse exposure. In Proceedings
    of the 2013 Conference on Computer Supported Cooperative Work Companion,
    CSCW ’13, pages 95–100, New York, NY, USA, 2013. ACM.
17. P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl. Grouplens: An
    open architecture for collaborative filtering of netnews. In Proceedings of the 1994
    ACM Conference on Computer Supported Cooperative Work, CSCW ’94, pages
    175–186, New York, NY, USA, 1994. ACM.
18. A. I. Schein, A. Popescul, L. H. Ungar, and D. M. Pennock. Methods and metrics
    for cold-start recommendations. In Proceedings of the 25th Annual International
    ACM SIGIR Conference on Research and Development in Information Retrieval,
    SIGIR ’02, pages 253–260, New York, NY, USA, 2002. ACM.
19. M. C. Willemsen, M. P. Graus, and B. P. Knijnenburg. Understanding the role of
    latent feature diversification on choice difficulty and satisfaction. User Modeling
    and User-Adapted Interaction, 26(4):347–389, Oct. 2016.
20. C.-N. Ziegler, S. M. McNee, J. A. Konstan, and G. Lausen. Improving recommen-
    dation lists through topic diversification. In Proceedings of the 14th International
    Conference on World Wide Web, WWW ’05, pages 22–32, New York, NY, USA,
    2005. ACM.