=Paper= {{Paper |id=None |storemode=property |title=Graph Embeddings for Movie Visualization and Recommendation |pdfUrl=https://ceur-ws.org/Vol-891/interfacers12_submission_1.pdf |volume=Vol-891 }} ==Graph Embeddings for Movie Visualization and Recommendation== https://ceur-ws.org/Vol-891/interfacers12_submission_1.pdf
               Graph Embeddings for Movie Visualization and
                           Recommendation

                  Michail Vlachos                                                      Daniel Svonava
          IBM Research - Zurich, Switzerland                               Slovak University of Technology, Slovakia



ABSTRACT                                                                        ISOMAP [12]. The goal of those projection methods is
In this work we showcase how graph-embeddings can be                            to preserve all distances approximately, while our approach
used as a movie visualization and recommendation inter-                         preserves a subset of distances (spanning-tree distances) ex-
face. The proposed low-dimensional embedding carefully                          actly.
preserves both local and global graph connectivity structure.                     We use the proposed graph embedding as the entry point
The approach additionally offers: a) recommendations based                       for a movie recommendation interface. Our methodology al-
on a pivot movie, b) interactive deep graph exploration of                      lows the exploratory visualization of the movie graph space
the movie connectivity graph, c) automatic movie trailer re-                    and incorporates additional capabilities, such a filtering of
trieval.                                                                        the graph based on various criteria, real-time graph explo-
                                                                                ration and automatic retrieval of the movie trailer from the
                                                                                Internet.
1. INTRODUCTION
   As we are moving gradually from the era of information                       2.   DESCRIPTION OF OUR APPROACH
to the era of recommendation, interactive interfaces that                          The proposed method combines the visual simplicity and
engage the users and let them easily discover the data of                       comprehensibility of the minimum spanning tree (MST) with
interest will be of increasing importance. In this work we                      the grouping properties of hierarchical clustering approaches
explore how graph-embedding techniques can be used as the                       (dendrograms). Given pairwise distances describing the re-
basis of an interactive recommendation engine for movies.                       lationship of high-dimensional objects, the objective is to
   Several recommendation systems have appeared in the lit-                     preserve a subset of important distances as well as possi-
erature in the recent years for recommending videos [5, 9]                      ble on 2D, while at the same time accurately conveying the
or movies [2, 6, 8, 10]. These, however, rarely focus on                        cluster information.
visually driven interfaces. In our scenario, we use a movie-
actor database as the underlying graph structure. We use
                                                                                                                                                    
textual features to describe the movie objects. Given a se-                                                                                          !"#

lected pivot movie the system can retrieve a set of similar
movies which are portrayed and clustered on two dimen-                                                 
                                                                                                             
                                                                                                            
                                                                                                             
                                                                                                             
                                                                                                             
                                                                                                             
                                                                                                             
                                                                                                                 

sions. Our method presents a novel way of capturing both                                                              
                                                                                                                        
                                                                                                                        
                                                                                                                        
                                                                                                                         
                                                                                                                        
                                                                                                                        
                                                                                                                               
                                                                                                                               

neighborhood and cluster structure. Neighborhood informa-                               
                                                                                                              
                                                                                                                                    


tion is preserved by retaining the Minimum Spanning Tree                                                                       
                                                                                                                               



(MST) structure on two-dimensions. This also partially pre-                                  

                                                                                                                                                    
serves the global graph structure, as the MST represents the
                                                                                                                                         
dataset ‘backbone’. In addition to the neighborhood struc-
ture, the method also retains the cluster structure which can
be visualized at different granularities. Finally, the proposed                  Figure 1. Conceptual illustration of our approach. The span-
mapping can accommodate both metric and non-metric dis-                         ning tree structure is preserved and projected onto a 2D plane,
tance functions.                                                                while cluster structure is overlayed in the third dimension. ‘Cut-
   Data-embedding techniques have been extensively used                         ting’ the dendrogram at a certain level projects the cluster
for visualizing high-dimensional data. Examples include the                     structure onto 2D.
Bourgain embedding [3], FastMap [7], BoostMap [1] and
                                                                                  Our approach works as follows. First, we construct a Min-
                                                                                imum Spanning Tree (MST) layout on the 2D plane in such
                                                                                a way that all distances to a user-selected pivot point and
                                                                                the neighborhood distances on the MST are exactly pre-
                                                                                served. This construction carefully considers how to best
Paper presented at the Workshop on Interfaces for Recommender Systems           portray the original object relationships for either metric or
2012, in conjunction with the 6th ACM conference on Recommender Sys-            non-metric distance functions. Secondly, the dendrogram
tems. Copyright @ 2012 for the individual papers is held by the papers’         cluster hierarchy is constructed so that it can be positioned
authors. This volume is published and copyrighted by its editors. Inter-
face@RecSys’12, September 13, 2012, Dublin, Ireland                             exactly on top of the MST mapping of objects. The cluster
.                                                                               hierarchy can be frozen at any resolution level (tomographic


                                                                           56
                  Visualization Method                      Trailer   Filters               Special Features
   netflix.com     Tables                                    Yes       Limited               Very large user-base
     jinni.com    Linear Treemap                            No        Yes                   Semantic search
   IMDB.com       Icon Tiles                                Yes       No                    Comprehensive database
 MovieLens.org    Tables                                    No        Yes                   Recommendations through initial test
  Our method      Minimum Spanning Dendrograms              Yes       Rating/Genre/Year     Clustering/Deep Graph Exploration

                            Table 1. Overview of features for several extant movie recommender systems


view) to convey the multi-granular clusters that are formed.               2(a). The fourth point is mapped at the intersection of the
This concept is elucidated in Figure 1: objects are properly               circles centered at A and C (Fig. 2(b)) and the fifth point
mapped on the 2D plane whereas on the third dimension                      is mapped similarly (Fig. 2(c)). The process continues until
the hierarchy of the clustering structure is portrayed. Nat-               all the points of the ST are positioned on the 2D plane and
urally, this is only a conceptual illustration of our approach.            the final result is shown in Fig. 2 (d).
In practice, cluster information is also projected onto two                   The mapping technique presented will retain exactly the
dimensions, e.g., by properly coloring the nodes belonging                 distances between all points and the pivot sequence, and
to the same cluster. Therefore, by ‘cutting’ the dendrogram                also between the nodes that lie at the edges of the spanning
derived on a user-defined level, clusters on 2D can be formed,              tree. This creates a powerful visualization technique that
expanded and contracted appropriately, as the user drills up               not only allows to preserve nearest neighbor distances (local
or down on the cluster hierarchy.                                          structure), but in addition retains distances with respect to
  The work presented here constitutes an demo prototype                    a single reference point, providing the option of a global data
of hhe visualization technique presented in [11]. Here we                  view using that object as a pivot.
explore in more detail how the proposed high-dimensional
data embedding methodology can be used as the interface                    Layout optimization using simulated annealing: For
of a movie search engine. In Table 1 we present briefly the                 reaching maximum visual clarity we try to minimize the
differences of our approach with respect to prevalent movie                 number of intersected graph edges. Recall that when tri-
search and recommendation systems.                                         angulating the position of a third point to its neighbor and
                                                                           the pivot, the algorithm proceeds by identifying the intersec-
2.1 Neighborhood Preservation                                              tion between two circles. One can readily see this in Figure
   We will first explain how to capture on two dimensions the               3; there are two positions in which a newly mapped point
relationship between a set of high-dimensional objects. As                 can be placed.
not all pairwise distances can be retained on two dimensions                                                     

we choose to maintain, as well as possible, the spanning tree
distances which partially capture the local relationships and                                       
                                                                                                                
also record information about the general global structure
[11].                                                                                                            
   We begin by constructing the spanning tree on the origi-
nal high-dimensional objects. One object is selected as pivot              Figure 3. Selecting which of the two positions a new point is
and mapped in the center of the 2D plane coordinate sys-                   mapped to
tem. By traversing the spanning tree, objects are positioned                  We employ a probabilistic global optimization technique
on the 2D plane by triangulating the distances to two ob-                  based on simulated annealing (SA) [4] that intelligently se-
jects: the pivot object and the neighboring point previously               lects which of the two mapping positions to use, so as to
mapped on the spanning tree.                                               minimize the number of crossed edges. SA is an effective op-
                                                  
                                                                          timization method when the search space is discrete, which
                                                                         is exactly the situation we face. Our experiments on the
                                                                           movie graph database suggest that the simulated annealing
                                                         
                                                                           process is very effective in reaching an improved layout.

                                                   
                                                                           Using Non-metric Distances: When the underlying dis-
                                                                           tance measure obeys the triangle inequality the circles around
   Figure 2. Two dimensional mapping of MST of objects                     the reference points are guaranteed to intersect. However,
                                                                           many widely used distance functions (e.g., dynamic warp-
Example: Suppose the first two points (A and B) of the                      ing, longest common subsequence) violate the triangle in-
MST have already been mapped, as shown in Fig. 2(a).                       equality, and thus the corresponding reference circles may
Let’s assume that the second distance preserved per object                 not necessarily intersect. In such cases, one needs to iden-
is the distance to a reference point, which in our case is the             tify the position where to place an object with respect to
first point. The third point is mapped at the intersection                  the two circles, in such a way that the object is mapped as
of circles centered at the reference points. The circles are               close as possible to the circumference of both circles. We
centered at A and B with radii of d(A, C) – the distance                   need to identify the locus of points that minimize the sum
between points A and C– and d(B, C), respectively. Owing                   of distances to the perimeters of two circles. One can show
to the triangle inequality, the circles either intersect in two            that the desired locus always lies on the line connecting the
positions or are tangent. Any position on the intersection of              centers of the two circles. An example is shown in Fig. 4.
the circles will retain the original distances towards the two
reference points. The position of point C is shown in Fig.                 2.2    Cluster Preservation


                                                                      57
                                                                      3.   GRAPHICAL INTERFACE
                                                                         On top of the proposed visualization methodology we have
                                               C    A2
                                                                      built a movie recommendation engine that allows the in-
                                                                      teractive exploration of a large movie graph. Our sample
                             A1
                                                                      database consists of 125,000 movies and 955,000 actors. We
                  L
                                                                      augment the information on each movie by attaching ad-
                                                                      ditional unstructured information from the web. So, each
                                                                      movie is described by a bag-of-words pertinent to the genre,
                                                                      actors, director(s), language, and a set of keywords relevant
Figure 4. Discovering the mapping point for non-metric dis-           to the plot. To evaluate movie similarities we follow an IR-
tances if the circles do not intersect.                               driven approach by considering the cosine similarity between
                                                                      the bag-of-words. More complex functions can be also ac-
                                                                      commodated, however this approach already gave very sat-
   Now we turn our attention to capturing and conveying               isfactory and intuitive results. So, similarity between movies
the cluster information on 2D. Recall that the input for the          is based solely on the content of the movie rather than on
algorithm is a matrix of pairwise distances. Based on the             any collaborative features (or ratings). We follow this path
pairwise distances given, one can build a hierarchical den-           in an effort to introduce a factor of serendipity in the rec-
drogram. The dendrogram construction is based on a single             ommendation process. Like this, more obscure movies can
linkage approach that merges closest singleton objects and            appear in the recommendation if they share similar content
clusters.                                                             (e.g. plot or mood) with the one that the user selected. In
                                                                      general, we have noticed that movie recommender systems
                                                           that consider collaborative features (e.g. users that have
                                                           rated highly movie A have also rated highly movie B) tend
                          
                                                                      to have a strong bias toward blockbuster movies that most
                                                                      users (typically) have already watched.
                                                                      GUI and Functionalities: The data visualization is ac-
                                                                      commodated through a graphical web interface, shown in
                                                                      Figure 6. The interface allows the user to search for a movie,
                                                                      and subsequently displays the proposed visualization graph,
                                                                      placing the movie selected as the center (pivot) object. The
                                                                      user can then easily identify other relevant movies, with the
                                                                      option to retrieve detailed information about the movie, such
                                                                      as participating actors or the movie plot. The user can mod-
                                                      ify the number of displayed movie clusters or even watch the
                                                         movie trailer. The application allows the exploration of both
                                    
                                                            sides of the movie-actor bipartite graph: either by deeper ex-
                                                          ploring of the movie graph (by clicking on a movie) or by
                                                                      searching/filtering for movies of a particular actor (by click-
Figure 5. Cluster information conveyed by variable thresh-            ing on an actor’s image). Additional filtering functionalities
olds on a dendrogram information, thus imposing ‘zoom-in’
and ‘zoom-out’ in the cluster structure.                              include: a) Filtering by rating, e.g., when interested in re-
                                                                      trieving movies with a rating > Y . b) Filtering by year,
                                                                      when the user is interested only in recent movie releases.
   Given the minimum spanning tree, the clustering process               Below we provide specific examples of our mapping, high-
can be sped up, because the merging order of the MST algo-            lighting its ability to be used as an effective and interactive
rithm is the same as the merging order of the single linkage          movie recommendation system. Number of clusters can be
hierarchical clustering approach. In addition, this reflects           interactively modified and clusters are conveyed using vary-
the fact that one can achieve an exact registration of the            ing border and edge colors.
constructed hierarchy on top of the MST-mapped points,
                                                                      Examples: Selecting ‘The Titanic’ as pivot movie produces
because the clustering order is the same as the order crys-
                                                                      the 2D mapping shown on the left-hand side of Fig. 7.
tallized in the spanning-tree mapping. This can be verified
                                                                      The user can navigate through the graph and identify sim-
in Fig. 1 where the single linkage dendrogram is positioned
                                                                      ilar movies. Clustered with Titanic are the movies: ‘Ti-
exactly on top of the spanning-tree mapping.
                                                                      tanic (1953)’, ‘Poseidon’ and ‘Shakespeare in Love’. An-
   The above observation combined with the hierarchical clus-
                                                                      other cluster displayed, shown in light green, includes movies
ter information provides the capacity for multi-granular clus-
                                                                      like: ‘The Notebook’, ‘Atonement’, ‘Purple Rain’; all ro-
ter views on 2D by interactively setting variable threshold
                                                                      mantic drama movies. Similarly, selecting ‘Star Wars’ as
levels on the resulting dendrogram. In this way, a ‘tomo-
                                                                      pivot movie correctly packs closely the remaining Star Wars
graphic view’ of the clusters with formation of variable size
                                                                      movies (Fig. 7, right-hand side). An adjacent cluster in-
clusters is possible. The concept is illustrated in Fig. 5:
                                                                      cludes parodies of Star Wars, like ‘Spaceballs’ or ‘Thumb
by cutting the dendrogram at a lower threshold, six clus-
                                                                      Wars’. Other related movies include adventure films like
ters are created on the left. Imposing a higher threshold,
                                                                      the ‘Lord of the Rings’ trilogy, or sci-fi action thrillers like
clusters are merged progressively as shown on the right side.
                                                                      ‘Aliens’ and ‘Star Trek’. Additional illustrative examples
Our prototype implementation conveys cluster information
                                                                      can be found in the provided video.
by coloring the node perimeters and the connected edges.



                                                                 58
                                                                                                                

                       
                                                        




                                                                 

                                                                  




                                                           
                    


                                 Figure 6. The interface of the Movie Recommendation web application




 Figure 7. Using the proposed mapping on the movie graph. Left: Pivot movie is ’Titanic’. Right: ‘Star Wars’ selected as pivot


Benefits of simulated annealing (SA): The proposed SA                             [2] J. Bennett and S. Lanning. The Netflix Prize. In Proc. of
component probes multiple node placement configurations,                              KDD Cup and Workshop, 2007.
                                                                                 [3] J. Bourgain. On Lipschitz embeddings of finite metric spaces
and picks the one that minimizes the number of intersected                           in Hilbert space. In Israel Journal of Mathematics, 52, pages
edges. To measure its effectiveness we select 1000 movies                             46–52, 1985.
at random and retrieve the k nearest neighbors. Table 2                          [4] V. Cerny. A thermodynamical approach to the travelling
                                                                                     salesman problem: an efficient simulation algorithm. In J.
reports the median number of edge intersections with and                             of Opt. Theory and Applications 45, pages 41–51, 1985.
without the SA component. We observe that the SA imple-                          [5] J. Davidson et al. The YouTube Video Recommendation Sys-
mentation significantly reduces the edge intersections and                            tem. In Proc. of ACM Recommender Systems, pages 293–
hence the screen clutter, providing a more intelligent node                          296, 2010.
                                                                                 [6] E. A. Eyjolfsdottir, G. Tilak, and N. Li. MovieGEN: A Movie
placement. Note, that both variations provide the same dis-                          Recommendation System. In UC Santa Barbara: Technical
tance preservation with respect to the pivot and the MST                             Report, 2010.
distances.                                                                       [7] C. Faloutsos and K. I. Lin. FastMap: A fast algorithm for
                                                                                     indexing, data-mining and visualization of traditional and
Table 2. Node placement using simulated annealing (SA) sig-                          multimedia datasets. In Proc. of SIGMOD, 1995.
                                                                                 [8] F. Gruvstad, N. Gupta, and S. Agrawal. Shiniphy - Visual
nificantly reduces the number of edge intersections                                   Data Mining of movie recommendations. In Stanford Uni-
                                                                                     versity: Technical Report, 2009.
                           Graph edge intersections                              [9] T. Mei, B. Yang, X.-S. Hua, L. Yang, S.-Q. Yang, and S. Li.
                           Without SA  With SA                                       VideoReach: an online video recommendation system. In
              k=20         36          12                                            Proc. of SIGIR, pages 767–768, 2007.
              k=30         108         48                                       [10] B. N. Miller, I. Albert, S. K. Lam, J. A. Konstan, and
              k=40         222         112                                           J. Riedl. MovieLens Unplugged: Experiences with an Occa-
              k=50         346         212                                           sionally Connected Recommender System. In Proc. of IUI,
                                                                                     pages 263–266, 2003.
                                                                                [11] D. Svonava and M. Vlachos. Graph Visualization using Min-
References                                                                           imum Spanning Dendrograms. In IEEE International Con-
                                                                                     ference on Data Mining, 2010.
 [1] V. Athitsos, J. Alon, S. Sclaroff, and G. Kollios. BoostMap:                [12] J. B. Tenenbaum, V. de Silva, and J. C. Langford. A global
     An Embedding Method for Efficient Nearest Neighbor Re-                            geometric framework for nonlinear dimensionality reduction.
     trieval. In IEEE Trans. Pattern Anal. Mach. Intell. 30(1),                      Science 290, pages 2319–2323, 2000.
     pages 89–104, 2008.



                                                                           59