      Towards Explanations for Visual Recommender Systems of
                          Artistic Images
                        Vicente Dominguez                                                          Pablo Messina
                           IMFD & PUC Chile                                                       IMFD & PUC Chile
                             Santiago, Chile                                                        Santiago, Chile
                           vidominguez@uc.cl                                                       pamessina@uc.cl

                         Christoph Trattner                                                          Denis Parra
                         University of Bergen                                                     IMFD & PUC Chile
                            Bergen, Norway                                                          Santiago, Chile
                       christoph.trattner@uib.no                                                   dparra@ing.puc.cl

ABSTRACT
Explaining automatic recommendations is an active area of research, since explanations have shown an important effect on users' acceptance of the items recommended. However, there is a lack of research on explaining content-based recommendations of images based on visual features. In this paper, we aim to fill this gap by testing three different interfaces (one baseline and two novel explanation interfaces) for artistic image recommendation. Our experiments with N=121 users confirm that explanations of recommendations in the image domain are useful and increase user satisfaction and the perception of explainability, relevance, and diversity. Furthermore, our experiments show that the results also depend on the underlying recommendation algorithm used. We tested the interfaces with two algorithms: Deep Neural Networks (DNN), with high accuracy but difficult-to-explain features, and the more explainable method based on Attractiveness Visual Features (AVF). The better the accuracy performance (in our case, the DNN method), the stronger the positive effect of the explainable interface. Notably, the explainable features of the AVF method increased the perception of explainability but did not increase the perception of trust, unlike DNN, which improved both dimensions. These results indicate that algorithms and interfaces together play a significant role in the perception of explainability and trust for image recommendation. We plan to further investigate the relationship between interface explainability and algorithmic performance in recommender systems.

KEYWORDS
Recommender systems, Artwork Recommendation, Explainable Interfaces, Visual Features

ACM Reference format:
Vicente Dominguez, Pablo Messina, Christoph Trattner, and Denis Parra. 2018. Towards Explanations for Visual Recommender Systems of Artistic Images. In Proceedings of IntRS Workshop, Vancouver, Canada, October 2018 (IntRS'18), 5 pages.

1   INTRODUCTION
Online artwork recommendation has received little attention compared to other areas such as movies [1, 10], music [4, 16] or points-of-interest [25, 28, 29]. The first works in the area date from 2006-2007, such as the CHIP project [2], which implemented traditional techniques such as content-based and collaborative filtering for artwork recommendation at the Rijksmuseum, and the m4art system by Van den Broek et al. [26], which used color histograms to retrieve artworks similar to a query painting image. More recently, deep neural networks (DNN) have been used for artwork recommendation and are the current state of the art [7, 12], which is rather expected considering that DNNs are the top-performing models for obtaining visual features for several tasks, such as image classification [15] and scene identification [23]. However, no user study has been conducted to validate the performance of DNNs versus other visual features. This aspect is important, since past works have shown that off-line results might not always replicate when tested with actual users [14, 17]. Moreover, we provide evidence of the value of explanations in artwork recommender systems over several dimensions of user perception. Visual features obtained from DNNs are still difficult to explain to users, despite current efforts to understand and explain them [20]. In contrast, features of visual attractiveness can be easily explained, based on color, brightness or contrast [21]. Explanations in recommender systems have been shown to have a significant effect on user satisfaction [24], and, to the best of our knowledge, no previous work has shown how to explain recommendations of images based on visual features. Hence, there is no study of the effect on users of explaining images recommended by a Visual Content-Based Recommender (hereinafter, VCBR).
   Objective. In this paper, we research the effect of explaining artistic image suggestions. In particular, we conduct a user study on Amazon Mechanical Turk with three different interfaces and two different algorithms. The three interfaces are: i) no explanations, ii) explanations based on similar images, and iii) explanations based on visual features. The two algorithms are: Deep Neural Networks (DNN) and Attractiveness Visual Features (AVF). In our study, we used images provided by the online store UGallery (http://www.UGallery.com/).
   Research Questions. To drive our research, the following two questions were defined:
IntRS’18, October 2018, Vancouver, Canada                                                                                  Dominguez et al.




Figure 1: Interface 1: Baseline recommendation interface without explanations.
Figure 2: Interface 2: Explainable recommendation interface with textual explanations and top-3 similar images.
Figure 3: Interface 3: Explainable recommendation interface with features' bar chart and top-1 similar image.
• RQ1. Given three different types of interfaces, one baseline interface without explanations and two with explanations (similar-image explanations and a feature bar chart), which one is perceived as most useful?
• RQ2. Furthermore, based on the visual content-based recommender algorithm chosen, are there observable differences in how the three interfaces are perceived?

2   RELATED WORK
Relevant related research is collated in two sub-sections: first, we review research on recommending artistic images to people; second, we summarize studies on explaining recommender systems. Both are important to the problem at hand. The final paragraph in this section highlights the differences to previous work and our contributions to the existing literature in the area.
   Recommendations of Artistic Images. The works of Aroyo et al. [2] with the CHIP project and Semeraro et al. [22] with FIRSt (Folksonomy-based Item Recommender syStem) made early contributions to this area using traditional techniques. More complex methods were implemented recently by Benouaret et al. [3], who used context obtained through a mobile application to make museum tour recommendations. Finally, the work of He et al. [12] addresses digital artwork recommendations based on pre-trained deep neural visual features, and the works of Dominguez et al. [7] and Messina et al. [18] compared neural against traditional visual features. None of the aforementioned works performed a user study with explanation interfaces to generalize their results.
   Explaining Recommender Systems. There is a body of related work on explanations for recommender systems [24]. Though a good amount of research has been published in the area, to the best of our knowledge, no previous research has conducted a user study to understand the effect of explaining recommendations of artwork images based on different visual features. The closest works in this respect are studies on automatic image captioning [9, 19] or on explaining image classifications [13], but they are not directly related to personalized recommender systems.
   Differences to Previous Research & Contributions. Although we focus on artistic images, to the best of our knowledge this is the first work which studies the effect of explaining recommendations of images based on visual features. Our contributions are two-fold: i) we analyze and report the positive effect of explaining artistic recommendations, especially for the VCBR based on neural features, and ii) through a user study we validate off-line results stating the superiority of neural visual features over attractiveness visual features on several dimensions, such as users' perception of explainability, relevance, trust and general satisfaction.

3   METHODS
In the following section we describe our study methods in detail. First, we introduce the dataset chosen for the purpose of our study. Second, we introduce the three explainable visual interfaces implemented, which we evaluate. Third, the two algorithms chosen for our study are presented. Finally, the user study procedure is explained.

3.1   Materials
For the purpose of our study we rely on a dataset provided by the online web store UGallery, which has been selling artwork for more than 10 years [27]. They support emergent artists by helping them sell their artwork online. For our research, UGallery provided us with an anonymized dataset of 1,371 users, 3,490 items and 2,846 purchases (transactions) of artistic artifacts, where all users have made at least one transaction. On average, each user bought 2-3 items over recent years.

3.2   The Explainable Recommender Interfaces
In our study we explore the effect of explanations in visual content-based artwork recommender systems. As such, our study contains conditions depending on how recommendations are displayed: i) no explanations, as shown in Figure 1; ii) explanations given by text and based on the top-3 most similar images the user liked in the past, as shown in Figure 2; and iii) explanations employing a visual attractiveness bar chart and showing the most similar image from the user's item profile, as presented in Figure 3.
   In all three cases the interfaces are vertically scrollable. While Interface 1 (baseline) shows 5 images per row, Interfaces 2 and 3 show one recommended image per row.

3.3   Visual Recommendation Approaches
As mentioned earlier in this paper, we make use of two different content-based visual recommender approaches in our work. The reason for choosing content-based methods over collaborative filtering-based methods is grounded in the fact that once an item is sold via the UGallery store, it is not available anymore (every item
[Figure 4: Study procedure. After the pre-study survey and the preference elicitation, users were assigned to one of three possible interfaces. In each interface they evaluated recommendations of two algorithms: DNN and AVF.]

Table 1: Evaluation dimensions and statements asked in the post-study survey. Users indicated their agreement with each statement on a scale from 0 to 100 (= totally agree).

   Dimension                Statement
   Explainable              I understood why the art images were recommended to me.
   Relevance                The art images recommended matched my interests.
   Diverse                  The art images recommended were diverse.
   Interface Satisfaction   Overall, I am satisfied with the recommender interface.
   Use Again                I would use this recommender system again for finding art images in the future.
   Trust                    I trusted the recommendations made.

is unique) and hence traditional collaborative filtering approaches do not apply.
   DNN Visual Features (DNN) Algorithm. The first algorithmic approach we employed was based on image similarity, itself based on features extracted with a deep neural network. The output vector representing the image is usually called the image's visual embedding. The visual embedding in our experiment was a vector of features obtained from an AlexNet, a convolutional deep neural network developed to classify images [15]. In particular, we use an AlexNet model pre-trained on the ImageNet dataset [6]. Using the pre-trained weights, a vector of 4,096 dimensions was generated for every image with the Caffe (http://caffe.berkeleyvision.org/) framework. We resized every image to 227x227 pixels, the standard pre-processing required by AlexNet.
   Attractiveness Visual Features (AVF) Algorithm. The second content-based algorithmic recommender approach employed was a method based on visual attractiveness features. San Pedro and Siersdorfer [21] proposed several explainable visual features that, to a great extent, can capture the attractiveness of an image posted on Flickr. Following their procedure, for every image in our UGallery dataset we calculated: (a) average brightness, (b) saturation, (c) sharpness, (d) RMS-contrast, (e) colorfulness and (f) naturalness. In addition, we added (g) entropy, which is a good way to characterize and measure the texture of an image [11]. These metrics have also been used in another study [8], where we show how to nudge people with attractive images to take up more healthy recipe recommendations. To compute these features, we used the original size of the images and did not pre-process them.
   Due to space constraints, the details of how to calculate the features are described in the article by Messina et al. [18].
   Computing Recommendations. Given a user u who has consumed a set of artworks P_u, a constrained profile size K, and an arbitrary artwork i from the inventory, the score of item i as a recommendation for u is:

\[ \mathrm{score}(u, i)^X = \frac{\sum_{r=1}^{\min\{K, |P_u|\}} \max_{j \in P_u}^{(r)} \left\{ \mathrm{sim}\left(V_i^X, V_j^X\right) \right\}}{\min\{K, |P_u|\}} \tag{1} \]

where $V_z^X$ is the feature vector of item z obtained with method X, and X can be either a pre-trained AlexNet (DNN) or attractiveness visual features (AVF). $\max^{(r)}$ denotes the r-th maximum value; e.g., if r = 1 it is the overall maximum, if r = 2 the second maximum, and so on. We compute the average similarity over the top-K most similar images because, as shown in Messina et al. [18], for different users the recommendations match better when using smaller subsets of the entire user profile. Users do not always look to buy a painting similar to one they bought before; rather, they look for one that resembles a set of artworks that they liked. $\mathrm{sim}(V_i, V_j)$ denotes a similarity function between vectors $V_i$ and $V_j$. In this particular case, the similarity function used was cosine similarity:

\[ \mathrm{sim}(V_i, V_j) = \cos(V_i, V_j) = \frac{V_i \cdot V_j}{\|V_i\| \, \|V_j\|} \tag{2} \]

   Both methods use the same formula to calculate the recommendations. The difference lies in the origin of the visual features: for the DNN method, the features were extracted with the AlexNet [15], and for AVF, the features were extracted following San Pedro et al. [21].

3.4   User Study Procedure
To evaluate the performance of our explainable interfaces we conducted a user study on Amazon Mechanical Turk using a 3x2 mixed design: 3 interfaces (between-subjects) and 2 algorithms (within-subjects, DNN and AVF). The interface conditions were: Interface 1, without explanations, as in Figure 1; Interface 2, where each item recommendation is explained based on the top-3 most similar images in the user profile, as in Figure 2; and Interface 3, where, only for AVF, each recommendation is explained with a bar chart of visual features, as in Figure 3. Notice that in the Interface 3 condition, for DNN we used the explanation based on the top-3 most similar images, because the neural embedding of 4,096 dimensions has no human-interpretable features to show in a bar chart.
   To compute the recommendations for each of the three interface conditions, two recommender algorithms were chosen: one based on DNN visual features, and the other based on attractiveness visual features (AVF). The order in which the algorithms were presented was chosen at random to diminish the chance of a learning effect.
   The full study procedure is shown in Figure 4. Participants accepted the study on Mechanical Turk (https://www.mturk.com) and were redirected to a web application. After accepting a consent form, they were redirected to the pre-study survey, which collects demographic data (age, gender) and the subject's previous knowledge of art, based on the test by Chatterjee et al. [5].
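As a concrete illustration, the two-stage pipeline above (per-image feature extraction, then top-K cosine scoring against the user profile) can be sketched in Python. This is a minimal sketch, not the authors' code: it assumes numpy, computes only three of the seven attractiveness features (with common textbook formulas that may differ in detail from San Pedro et al. [21]), and the helper names (`avf_features`, `cosine`, `score`) are ours.

```python
import numpy as np

def avf_features(img):
    """A small subset of attractiveness features for an RGB image
    (H x W x 3 array, values in [0, 255]). Illustrative formulas only;
    the exact variants used in the paper are described in [18] and [21]."""
    gray = img.astype(float).mean(axis=2)       # naive grayscale conversion
    brightness = gray.mean()                    # (a) average brightness
    rms_contrast = gray.std()                   # (d) RMS contrast
    hist, _ = np.histogram(gray, bins=256, range=(0, 255))
    p = hist / hist.sum()                       # gray-level probabilities
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum()           # (g) Shannon entropy of the histogram
    return np.array([brightness, rms_contrast, entropy])

def cosine(a, b):
    """Eq. (2): cosine similarity between two non-zero feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def score(item_vec, profile_vecs, K):
    """Eq. (1): average of the top-min(K, |Pu|) cosine similarities between
    a candidate item and the feature vectors of the user's consumed items."""
    sims = sorted((cosine(item_vec, v) for v in profile_vecs), reverse=True)
    k = min(K, len(profile_vecs))
    return sum(sims[:k]) / k
```

Candidate items would then be ranked by `score` and the top ten shown to the user; in the DNN condition, `item_vec` would instead be the 4,096-dimensional AlexNet embedding, with the scoring function unchanged.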

Table 2: Results of users' perception over several evaluation dimensions, defined in Table 1. Scale 0-100 (higher is better), except for Average Rating (scale 1-5). DNN: Deep Neural Network; AVF: Attractiveness Visual Features. The symbol ↑1 indicates an interface-wise significant difference with respect to Interface 1 (comparing interfaces using the same algorithm). The symbol * denotes an algorithm-wise statistical difference (comparing a dimension between algorithms, using the same interface).

  Condition                                          Explainable       Relevance        Diverse          Interface Satisf.  Use Again        Trust            Avg. Rating
                                                     DNN      AVF      DNN      AVF     DNN      AVF     DNN      AVF       DNN      AVF     DNN      AVF     DNN      AVF
  Interface 1 (no explanations)                      66.2*    51.4     69.0*    53.6    46.1     69.4*   69.9     62.1      65.8     59.7    69.3     63.7    3.55*    3.23
  Interface 2 (DNN & AVF: top-3 similar images)      83.5*↑1  74.0↑1   80.0*    61.7    58.8     69.9*   76.6*    61.7      76.1*    65.9    75.9*    62.7    3.67*    3.00
  Interface 3 (DNN: top-3 similar, AVF: bar chart)   84.2*↑1  70.4↑1   82.3*↑1  56.2    65.3↑1   71.2    69.9*    63.3      78.2*    58.7    77.7*    55.4    3.90*    2.99

  Statistical significance between interfaces: multiple t-tests with Bonferroni correction, α_bonf = α/n = 0.05/3 ≈ 0.0167. Statistical significance between algorithms: pairwise t-tests, α = 0.05.
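The significance testing described in the table footnote can be sketched as follows. This is an illustrative reconstruction, not the authors' analysis code: the sample values below are synthetic, and `scipy.stats.ttest_ind` stands in for whatever t-test implementation was actually used.

```python
from scipy import stats

def bonferroni_significant(sample_a, sample_b, n_comparisons, alpha=0.05):
    """Two-sample t-test whose p-value is compared against the
    Bonferroni-corrected threshold alpha / n_comparisons."""
    t, p = stats.ttest_ind(sample_a, sample_b)
    return bool(p < alpha / n_comparisons), float(p)

# Synthetic 'Explainable' scores for Interface 1 vs. Interface 2 (DNN);
# three interface-wise comparisons give the corrected threshold 0.05/3.
interface_1 = [66, 70, 61, 68, 64, 67, 65, 69]
interface_2 = [84, 82, 88, 80, 85, 83, 86, 81]
significant, p_value = bonferroni_significant(interface_1, interface_2, n_comparisons=3)
```

With well-separated samples like these, the p-value falls far below the corrected threshold, so the difference would be flagged as interface-wise significant (the ↑1 marker in Table 2).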

   Following this, they had to perform a preference elicitation task. In this step, the users had to "like" at least ten paintings, using a Pinterest-like interface. Next, they were randomly assigned to one interface condition. In each condition, they gave feedback (a rating on a 1-5 scale for each image) on the top ten image recommendations produced by either the DNN or the AVF algorithm (also assigned at random, as discussed before), and were then asked to answer a post-algorithm survey. The dimensions evaluated in the post-algorithm survey are the same for the DNN and AVF algorithms and are shown in Table 1. This process was repeated for the second algorithm. Once the participants finished answering the second post-study survey, they were redirected to the final view, where they received a survey code for later payment on Amazon Mechanical Turk.

4   RESULTS
In total, 200 users finished the study, of whom 121 answered our validation questions successfully and were hence included in the results. We set two validation questions to check the attention of our study participants. Filtering out the users who did not respond properly to these questions left 41 users in the Interface 1 condition, 41 in Interface 2, and 39 in Interface 3. Participants were paid 0.40 USD for the study, which took them around 10 minutes to complete.
   Our subjects were between 18 and over 60 years old: 36% were between 25 and 32 years old, and 29% between 32 and 40. Females made up 55.4%. 12% had just finished high school, 31% had some college education, and 57% held a bachelor's, master's or Ph.D. degree. Only 8% reported some visual impairment. With respect to their understanding of art, 20% had no experience, 48% had attended 1 or 2 lessons, and 32% reported having attended 3 or more at high-school level or above. 20% of our subjects also reported that they had almost never visited a museum or an art gallery; 36% do so once a year; and 44% do so once every 1 to 6 months.
   Differences between Interfaces. Table 2 summarizes the results. […] is more transparent, since it explains exactly what is used to recommend (brightness, saturation, sharpness, etc.). People report that they understand why the images are recommended (70.4), but since the relevance is rather insufficient (56.2), the perception of trust is reported as low (55.4).
   Differences between Algorithms. With the only exception of the dimension Diverse, where AVF was significantly better, DNN was by and large perceived more positively than AVF. In Interfaces 2 and 3, the DNN method was perceived as significantly better in 5 dimensions (explainability, relevance, interface satisfaction, interest in eventual use, and trust), and also obtained a higher average rating.
   Overall, the results indicate that the explainable interface based on the top-3 similar images works better than an interface without explanations. Moreover, this effect is enhanced by the accuracy of the algorithm: even if the algorithm has no explainable features (DNN), it can induce more trust if the user perceives higher predictive accuracy.

5   CONCLUSIONS & FUTURE WORK
In this paper, we have studied the effect of explaining recommendations of images employing three different recommender interfaces, as well as their interactions with two different visual content-based recommendation algorithms: one with high predictive accuracy but unexplainable features (DNN), and another with lower accuracy but higher potential for explainable features (AVF).
   The first result, which answers RQ1, shows that explaining the recommended images has a positive effect compared with giving no explanation. Moreover, the explanation based on the top-3 similar images presents the best results, but we must consider that the alternative method, explanations based on visual features, was only used with AVF. This result is preliminary and opens a research path toward new interfaces that could help explain the features a deep neural network learns from images.
   Regarding RQ2, we see that the algorithm used plays an important role in conjunction with the interface. DNN is perceived better than AVF in most dimensions evaluated, showing that further research should focus on the interaction between algorithm and
sults of the user study. First we compared interface performance
                                                                                                    explainable interfaces. In the future we will expand this work to
and then we looked at the algorithmic performance. The explainable
                                                                                                    other datasets, beyond artistic images, to generalize our results.
interfaces (Interface 2 and 3) significantly improved the perception
of explainability compared to Interface 1 under both algorithms.
There is also a significant improvement over Interface 1 in terms                                   6         ACKNOWLEDGEMENTS
of relevance and diversity, but this is only achieved by the DNN                                    The authors from PUC Chile were funded by Conicyt, Fondecyt
method when this is compared against the AVF method using the                                       grant 11150783, as well as by the Millennium Institute for Founda-
interface 3. Interestingly, this is the condition where the interface                               tional Research on Data (IMFD).
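The "top 3 similar images" style of explanation can be illustrated with a minimal sketch: rank the images a user has liked by cosine similarity of their visual feature vectors to the recommended image, and surface the three most similar ones as the explanation. This is an illustrative assumption about the mechanism, not the paper's exact implementation; all names and data below are hypothetical.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def top_k_similar(query_vec, liked_vecs, k=3):
    """Indices of the k liked images most similar to the recommended
    image, ordered from most to least similar."""
    sims = [(cosine(query_vec, v), i) for i, v in enumerate(liked_vecs)]
    return [i for _, i in sorted(sims, reverse=True)[:k]]

# Toy example: 4 liked images with 3-dimensional feature vectors
# (in practice these would be DNN or attractiveness features).
liked = [[1.0, 0.0, 0.0],
         [0.9, 0.1, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
recommended = [1.0, 0.05, 0.0]
print(top_k_similar(recommended, liked))  # [0, 1, 2]
```

The returned indices identify the liked images to display alongside the recommendation ("recommended because you liked these").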
IntRS’18, October 2018, Vancouver, Canada                                                                                                                Dominguez et al.


REFERENCES
 [1] Xavier Amatriain. 2013. Mining large streams of user data for personalized recommendations. ACM SIGKDD Explorations Newsletter 14, 2 (2013), 37–48.
 [2] LM Aroyo, Y Wang, R Brussee, Peter Gorgels, LW Rutledge, and N Stash. 2007. Personalized museum experience: The Rijksmuseum use case. In Proceedings of Museums and the Web.
 [3] Idir Benouaret and Dominique Lenne. 2015. Personalizing the Museum Experience through Context-Aware Recommendations. In Systems, Man, and Cybernetics (SMC), 2015 IEEE International Conference on. IEEE, 743–748.
 [4] Oscar Celma. 2010. Music recommendation. In Music Recommendation and Discovery. Springer, 43–85.
 [5] Anjan Chatterjee, Page Widick, Rebecca Sternschein, William Smith II, and Bianca Bromberger. 2010. The Assessment of Art Attributes. 28 (07 2010), 207–222.
 [6] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. ImageNet: A large-scale hierarchical image database. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 248–255.
 [7] Vicente Dominguez, Pablo Messina, Denis Parra, Domingo Mery, Christoph Trattner, and Alvaro Soto. 2017. Comparing Neural and Attractiveness-based Visual Features for Artwork Recommendation. In Proceedings of the Workshop on Deep Learning for Recommender Systems, co-located at RecSys 2017. DOI:http://dx.doi.org/10.1145/3125486.3125495 arXiv:1706.07515
 [8] David Elsweiler, Christoph Trattner, and Morgan Harvey. 2017. Exploiting food choice biases for healthier recipe recommendation. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 575–584.
 [9] Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollár, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John Platt, and others. 2015. From captions to visual concepts and back. (2015).
[10] Carlos A Gomez-Uribe and Neil Hunt. 2016. The Netflix recommender system: Algorithms, business value, and innovation. ACM Transactions on Management Information Systems (TMIS) 6, 4 (2016), 13.
[11] Rafael C Gonzalez, Steven L Eddins, and Richard E Woods. 2004. Digital Image Processing Using MATLAB. Prentice Hall.
[12] Ruining He, Chen Fang, Zhaowen Wang, and Julian McAuley. 2016. Vista: A Visually, Socially, and Temporally-aware Model for Artistic Recommendation. In Proceedings of the 10th ACM Conference on Recommender Systems (RecSys '16). ACM, New York, NY, USA, 309–316. DOI:http://dx.doi.org/10.1145/2959100.2959152
[13] Lisa Anne Hendricks, Zeynep Akata, Marcus Rohrbach, Jeff Donahue, Bernt Schiele, and Trevor Darrell. 2016. Generating visual explanations. In European Conference on Computer Vision. Springer, 3–19.
[14] Joseph A Konstan and John Riedl. 2012. Recommender systems: from algorithms to user experience. User Modeling and User-Adapted Interaction 22, 1-2 (2012), 101–123.
[15] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems. 1097–1105.
[16] Pattie Maes and others. 1994. Agents that reduce work and information overload. Commun. ACM 37, 7 (1994), 30–40.
[17] Sean M McNee, Nishikant Kapoor, and Joseph A Konstan. 2006. Don't look stupid: avoiding pitfalls when recommending research papers. In Proceedings of the 2006 20th Anniversary Conference on Computer Supported Cooperative Work. ACM, 171–180.
[18] Pablo Messina, Vicente Dominguez, Denis Parra, Christoph Trattner, and Alvaro Soto. 2018. Content-Based Artwork Recommendation: Integrating Painting Metadata with Neural and Manually-Engineered Visual Features. User Modeling and User-Adapted Interaction (2018). DOI:http://dx.doi.org/10.1007/s11257-018-9206-9
[19] Margaret Mitchell, Xufeng Han, Jesse Dodge, Alyssa Mensch, Amit Goyal, Alex Berg, Kota Yamaguchi, Tamara Berg, Karl Stratos, and Hal Daumé, III. 2012. Midge: Generating Image Descriptions from Computer Vision Detections. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL '12). Association for Computational Linguistics, Stroudsburg, PA, USA, 747–756. http://dl.acm.org/citation.cfm?id=2380816.2380907
[20] Chris Olah, Alexander Mordvintsev, and Ludwig Schubert. 2017. Feature Visualization. Distill (2017). DOI:http://dx.doi.org/10.23915/distill.00007 https://distill.pub/2017/feature-visualization.
[21] Jose San Pedro and Stefan Siersdorfer. 2009. Ranking and Classifying Attractiveness of Photos in Folksonomies. In Proceedings of the 18th International Conference on World Wide Web (WWW '09). ACM, New York, NY, USA, 771–780. DOI:http://dx.doi.org/10.1145/1526709.1526813
[22] Giovanni Semeraro, Pasquale Lops, Marco De Gemmis, Cataldo Musto, and Fedelucio Narducci. 2012. A folksonomy-based recommender system for personalized access to digital artworks. Journal on Computing and Cultural Heritage (JOCCH) 5, 3 (2012), 11.
[23] Ali Sharif Razavian, Hossein Azizpour, Josephine Sullivan, and Stefan Carlsson. 2014. CNN features off-the-shelf: an astounding baseline for recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 806–813.
[24] Nava Tintarev and Judith Masthoff. 2015. Explaining recommendations: Design and evaluation. In Recommender Systems Handbook. Springer, 353–382.
[25] Christoph Trattner, Alexander Oberegger, Lukas Eberhard, Denis Parra, Leandro Marinho, and others. 2016. Understanding the Impact of Weather for POI Recommendations. Proceedings of RecTour Workshop, co-located at ACM RecSys (2016).
[26] Egon L van den Broek, Thijs Kok, Theo E Schouten, and Eduard Hoenkamp. 2006. Multimedia for art retrieval (m4art). In Multimedia Content Analysis, Management, and Retrieval 2006, Vol. 6073. International Society for Optics and Photonics, 60730Z.
[27] Deborah Weinswig. 2016. Art Market Cooling, But Online Sales Booming. https://www.forbes.com/sites/deborahweinswig/2016/05/13/art-market-cooling-but-online-sales-booming/. (2016). [Online; accessed 21-March-2017].
[28] Mao Ye, Peifeng Yin, Wang-Chien Lee, and Dik-Lun Lee. 2011. Exploiting geographical influence for collaborative point-of-interest recommendation. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 325–334.
[29] Quan Yuan, Gao Cong, Zongyang Ma, Aixin Sun, and Nadia Magnenat Thalmann. 2013. Time-aware point-of-interest recommendation. In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 363–372.