=Paper=
{{Paper
|id=Vol-1245/paper2
|storemode=property
|title=HybridRank: A Hybrid Content-Based Approach To Mobile Game Recommendations
|pdfUrl=https://ceur-ws.org/Vol-1245/cbrecsys2014-paper02.pdf
|volume=Vol-1245
|dblpUrl=https://dblp.org/rec/conf/recsys/ChowFM14
}}
==HybridRank: A Hybrid Content-Based Approach To Mobile Game Recommendations==
<pdf width="1500px">https://ceur-ws.org/Vol-1245/cbrecsys2014-paper02.pdf</pdf>
<pre>
HybridRank : A Hybrid Content-Based Approach To Mobile
                Game Recommendations

                  Anthony Chow                         Min-Hui Nicole, Foo,               Giuseppe Manai, Ph.D.
                 Group Digital Life                          Ph.D.                              Group Digital Life
                    Singapore                             Group Digital Life                       Singapore
              Telecommunications Ltd                         Singapore                       Telecommunications Ltd
            awjchow@singtel.com                        Telecommunications Ltd              giuseppe@singtel.com
                                                     nicolefoo@singtel.com

ABSTRACT                                                               corporation of both content-based and user-based signals to
The massive number of mobile games available necessitates              generate recommendations. Online evaluations were con-
a technique to help the consumer find the right game at the            ducted and results yield that the approach presented per-
right time. This paper introduces HybridRank, a novel hy-              formed best when compared against user-based collabora-
brid algorithm to deliver recommendations for mobile games.            tive filtering[16] and content-based filtering approaches.[15]
This technique is based on a personalised random walk ap-
proach, with the incorporation of both content-based and               2. RELATED WORK
user-based information in the formulation of the recommen-
                                                                       Recommendation algorithms can be broadly classified into
dations. This technique is evaluated against traditional neigh-
                                                                       two well known techniques: collaborative filtering methods
bourhood based collaborative filtering and content-based rec-
                                                                       and content-based methods. User-user collaborative filtering
ommendation algorithms. This paper also explores the fact
                                                                       starts by placing the user in a vector space of their explicit
that this algorithm can also be used to help alleviate the
                                                                       and implicit activities. A nearest neighbour algorithm with
cold start problem that is associated with little user data.[1]
                                                                       a defined distance metric is then applied to identify what
Online evaluations were conducted and results yield that the
                                                                       items the users might like based on behaviours of users that
approach presented performed the best in both a controlled
                                                                       are most similar to them. Such an algorithm typically faces
testing environment as well as in live production. This algo-
                                                                       issues of data sparsity [9] and algorithm complexity[5].
rithm is currently implemented in a live mobile game plat-
form developed by Singapore Telecommunications Ltd called
                                                        Unlike collaborative filtering, content based recommenda-
WePlay.
                                                        tion systems takes the approach of item-to-item correlation.
                                                        With this technique, the system learns to recommend items
Categories and Subject Descriptors                      that are similar to the ones that the user liked in the past.
H.4 [Recommender Systems]: personalised pagerank,random The similarity is calculated based on features associated
walk,online evaluation,cold-start,hybrid recommender    with the items being compared.

General Terms                                                          There is also an increasing focus on the ability to combine
Experimentation, Algorithms, Measurement                               both user and content metadata together in a hybrid way to
                                                                       generate recommendations such as using Naive Bayes [1] or
                                                                       clustering techniques [7]. Graph based approaches of using
1. INTRODUCTION                                                        concept graphs [12] and Markov Chains [6] have also been
Singapore Telecommunications Ltd (SingTel) launched We-                presented. Such approaches usually model the users as a
Play, a mobile game app store in early 2014. With an in-               bipartite graph where the nodes are users and items, and
creasing number of mobile games available to the consumer,             a link is drawn between the nodes if a user has done an
there is a need for the development of a mobile game rec-              activity on the item, i.e. watched a movie, liked a restaurant
ommendation system. Key work on this domain has been                   or listened to a song.
done by Xbox [2] and others [3]. This paper presents a novel
approach, HybridRank, which is a hybrid content-based rec-             This paper explores new approaches towards the graph-based
ommender system using a biased random walk model. This                 hybrid recommender system problem. It draws heavily on
is an adaptation of the ItemRank algorithm [10] for the in-            the Pagerank algorithm [8]. This algorithm has been adapted
                                                                       by ItemRank [10], as well as LBSNRank[4].The Pagerank al-
                                                                       gorithm computes an importance score for each node, and
                                                                       one can use this important score as a measure of importance
                                                                       within the network to be used to provide recommendations.


Copyright 2014 for the individual papers by the paper’s authors.       3. PROBLEM DESCRIPTION
Copying permitted for private and academic purposes. This volume is    A recommender system deals with a set of users ui where
published and
CBRecSys      copyrighted
          2014, October 6, by its editors.
                           2014,  Silicon Valley, CA, USA.             i = 1, ..., n and a set of items pj where j = 1, ..., m. For each
CBRecSys 2014
Copyright 2014,byOctober 6, 2014, Silicon Valley, CA, USA.
                  the author(s)                                        user, item pair, (ui , pj ) the system generates a score that


                                                                  10
will describe the relationship between ui and pj captured           matrix, the normalised matrices are not symmetric. The
in a relationship score rij . To generate this score, this pa-      diagonals are also 0 by definition. These two correlation
per proposes HybridRank, an algorithm that combines both            matrices become valuable graphic model to indicate correla-
items features and user behaviours in a novel way for recom-        tions between games. The weights associated with the links
mendations. The key steps in setting up the algorithm is to         provide approximate measures of games relation.
generate two matrices based on user and feature correlation
respectively.
                                                                    4.   HYBRIDRANK: THE ALGORITHM
                                                                    The idea underlying HybridRank is that a hybrid combi-
3.1 User and Feature Correlation Matrices                           nation of both the user and feature correlation graph can
The user correlation graph Gu draws the correlation between         be used to forecast the user preferences in a content-based
games via user co-occurrence, i.e, a link appears between 2         approach. The personalised page rank algorithm has been
games gi and gj if one or more users have downloaded both           shown to be a good algorithm to be used for such a use case
games. It is noted that multiple user co-occurrence matrices        as in [10]. This algorithm o↵ers key properties of propa-
can be developed, where links between users can be drawn            gation and attenuation. Utilising the relationships between
not only when they have downloaded both games, but also             games, captured by both the feature and user correlation
the frequency of which they have viewed and played the              matrices, Ũ and F̃, the personalised page rank algorithm
games on the platform. For these co-occurrence matrices,            is able to propagate preferences through the graph from a
define matrix M 2 Rn⇤m where n is number of games and m             given starting point. As the preferences move further away
is number of users. Mxij represents the number of times a           from the seed nodes, the influence of the user preferences
user ui has conducted an action x on the game gj . Examples         diminishes, and such an attenuation property is aptly cap-
of actions would be viewed, downloaded or played. Their             tured by the said algorithm. The personalised page rank
respective correlation matrix can then be generated via the         algorithm is defined as below:
inner product of the matrices as follows:


                                                                                  P Rui = ↵ · M · P Rui + (1       ↵) · dui             (3)
                        Ux = Mx · M|x                       (1)

                                                                    P Rui refers to the personalised page rank vector for a partic-
The feature correlation graph Gf on the other hand draws
                                                                    ular user ui , which gives an indication of the importance the
the correlation between games via feature co-occurrence, i.e.
                                                                    di↵erent nodes in the system to the user ui . M refers to the
a link appears between two games gi and gj if both games
                                                                    stochastic matrix which captures the connectivity between
share one or more metatags. For example, two games that
                                                                    all the nodes in the system. This paper uses the feature cor-
share the same developer, or price points will share a link.
                                                                    relation matrix F̃ to represent the connections between the
The feature set Fij is defined as the set of features which be-
                                                                    games. The vector dui is often referred to as the teleport
long to both gi and gj where i 6= j and i, j 2 Savailablegames .
                                                                    vector, which allows the introduction of bias into the system
These features can typically be generated from two sources.
                                                                    to a given user ui . This generates a static score distribution
The first source will be structured information provided by
                                                                    vector of all the items that user has consumed or has an
the developer, the second being user generated content like
                                                                    opinion for. For example, the j th element of the vector dui
reviews. This paper focuses on utilising structured metatags
                                                                    will be 1 if the user ui has downloaded the game, and 0
provided by the former source.
                                                                    otherwise. The vector will then be normalised to sum to 1.
It is noted there are some features that are more impor-
                                                                    The HybridRank algorithm builds on this idea by introduc-
tant in determining game similarity as compared to others,
                                                                    ing the user correlation matrix U to build this vector dui .
for example two games sharing the same game mechanics is
                                                                    Let the set Dudli , Duv i and Dup i be the set of games that the
considered more similar than two games sharing the same
                                                                    user ui has downloaded, viewed and played respectively. The
price point. [14]. As such, a set of weights k associated
                                                                    vector dui can be defined as follows:
with each feature can be defined. This set of weights can
be learned, defined via empirical experiments or assigned
via a TF-IDF on the content vector of the items[13]. Specif-                       X                X                      X
ically for the experiments conducted a hierarchy of metadata         djui =   ·             ˜ +⌘·
                                                                                           Udl              Uv˜jk + ✓ ·           Up˜jk (4)
                                                                                               jk
was defined, and weights were assigned from qualitative user                         dl
                                                                                  k2Du                 v
                                                                                                    k2Du                      p
                                                                                                                          k2Dui
                                                                                       i                i
feedback. With that, the feature correlation matrix can be
defined as follows:
                                                                    Next normalise the vector to sum to one, to obtain d˜ui . For
                                                                    this paper, equal weights have been assigned to the weights
                                n
                                X                                     , ⌘ and ✓ and further optimisation is underway. In the
                        Fij =         k 1k                  (2)     simple case, where the user has only downloaded one game,
                                k=1                                 the vector d˜ui will simply be the j th column of the matrix
                                                                    U corresponding to the game that the user has downloaded.
1k will be 1 if both games gi and gj share feature k, and 0         This draws upon not only the user preferences, but also as-
otherwise. We normalise both matrices, U and F column-              signs a bias towards games that are close in relationship to
wise to generate stochastic matrices Ũ and F̃, such that each      the games selected via a simple collaborative approach or
column sums up to 1. While the former is a symmetrical              captured via the user co-occurrence matrix.


                                                               10
Linear algebra approaches via power iteration can be used            Type           Definitions                            No of
to solve equation 3. There has been research to improve                                                                    testers
computation efficiency, one of them being in [17]. Also, in          Trend          Testers are at the forefront and ac-   34
terms of complexity, [10] has shown that such a computation          Seeker and     tively contribute online reviews to
is efficient from both computation and memory resources.             Contributor    share knowledge
                                                                     (Grp1)
5. EXPERIMENT RESULTS                                                Trend          Testers are at the forefront and       77
The HybridRank algorithm was developed and deployed in               Seeker         less actively/seldom contribute
two separate live experiments in relation to mobile game             (Grp2)         online reviews to share knowledge
recommendations. The first experiment was done in a con-             Contributor    Testers who are not at the fore-       41
trolled fashion with a preloaded web prototype with a group          (Grp3)         front but are actively involved in
of 526 users. The second experiment was done on the pro-                            information exchanges with like-
duction app WePlay with over 100,000 users in Indonesia.                            minded gamers through online re-
                                                                                    views
                                                                     Social         Testers who take heed from what        128
5.1 Online Evaluation 1                                              (Grp4)         their friends/social circle play
The seed dataset has 78099 users and 199 games. Each game            Indi↵erent     Testers who just want to stay in       246
also comes with its set of 149 set of metadata, including tags       (Grp5)         the game and are indi↵erent to
that are temporal in nature, i.e. whether it is trending, top                       what others say
grossing or curated by marketing team for that week. The
following shows the distribution of the tags available. As           Table 2: Distribution and definitions of user testing groups
from equation 2, weights were assigned to several top-level
categories. This was purely done via qualitative assumptions
and reasoning.
                                                                    Table 3 shows the performance results of the four algorithms
                                                                    across the five di↵erent segments. Success of the algorithms
              Tag Type                   Tag Count
                                                                    were measured by comparing the lift in average number of
              Developers                     79         0.2         games selected against baseline. It can be seen that Hy-
              Categories                     25         0.3         bridRank provided maximal lift in segments of users who
             Price Ranges                    11        0.05         are indi↵erent and social gamers. For the small segment of
            ESBN Ratings                      6         0.1         users who are trend seekers, it appears that the top grossing
            In-App Goods                      2        0.05         algorithm performed the best. This could be because the
    Others (Motivations, Goals etc)          26         0.3         HybridRank did not take into consideration global market
                                                                    features in the development of the item metadata.
Table 1: Distribution of metatags available and weights assigned
                                                                           Grp1     HyRank      CF        TopG       Baseline
                                                                          Average     2.12      1.56       1.88        1.15
A testing portal was developed and sent to 526 digitally                   Lift     +0.846     +0.359    +0.641         0
savvy members of SingTel Digital Advisor Panel1 . These
                                                                           Grp2     HyRank      CF        TopG       Baseline
users were asked to select up to five mobile games that they
like, and four separate lists of recommendations were pro-                Average    1.756     1.536      1.878       0.927
vided generated by HybridRank, kNN Collaborative Filter-                   Lift     +0.895     +0.658     +1.03         0
ing, Top Grossing and Baseline. The baseline algorithm                     Grp3     HyRank      CF        TopG       Baseline
randomly selects games from the entire catalog. Each set                  Average    1.922     1.481      1.805       0.974
of recommendations exposed seven games. The users were                     Lift     +0.973     +0.52     +0.853         0
asked to then choose the mobile games they like across all
                                                                           Grp4     HyRank      CF        TopG       Baseline
the lists provided.
                                                                          Average    2.023     1.585      1.781       0.914
To evaluate the results, users were segmented2 across the di-              Lift     +1.214     +0.735    +0.9487        0
mensions of externalised gratifications and internalised fulfil-           Grp5     HyRank      CF        TopG       Baseline
ment. The former comprises of factors associated with basic               Average    1.671     1.276      1.459       0.825
progression in the game, scoring, beating the competition.                 Lift      +1.02     +0.546    +0.768         0
The latter involves the altruistic sharing of knowledge and
experience, helping others in game progression and gaining          Table 3: Results of recommendations lift across the di↵erent
respect and trusted recognition. Table 2 gives the definition       groups
and distribution of users across the segments.
1
  This is a panel of 15,000 users across South East Asia main-      5.2     Online Evaluation 2
tained by SingTel Group Digital Life to help in testing of new      In this second evaluation, the algorithm was exposed to over
digital products. The users in this study have been screened        100,000 users in Indonesia in the live WePlay app with over
to have played at least a mobile game in the past one month.
2
  This framework is an ongoing research by the team in              900 games to recommend from. The section evaluated rec-
Group Digital Life SingTel in an e↵ort to deeper understand         ommends games that are similar to the selected game. This
the gamer’s psyche, fundamentally based on Maslow0 s Hier-          particular use-case can be likened to the cold start prob-
archy of Needs [11]                                                 lem, where there are no previous preferences of the user and


                                                               11
the only preference being the current selected game. The              Systems: technologies and applications, pages 104–113,
HybridRank algorithm was compared with two other algo-                2012.
rithms. The first being a commonly used content-based algo-       [4] Zhaoyan Jin et al. Lbsnrank: personalized pagerank
rithm via an euclidean distance metric on the feature vector          on location-based social networks. Proceedings of the
of the games as in [15] and the second being a baseline that          2012 ACM Conference on Ubiquitous Computing,
chooses the more popular items within the same category               pages 980–987, 2012.
as the selected game. The experiment was conducted live           [5] Diego Fernandez Fidel Cacheda, Victor Carneiro and
on the platform in Indonesia for a period of one month in             Vreixo Formoso. Comparison of collaborative filtering
an out of time validation fashion. The entire base was ex-            algorithms: Limitations of current techniques and
posed to the algorithms in an alternating day fashion. The            proposals for scalable, high-performance recommender
click through rates of the suggested game were measured               systems. ACM Transactions on the Web (TWEB),
- the higher the click through rate, the more e↵ective the            5(1):2:1–2:33, 2011.
algorithm was considered to be.                                   [6] Francois Fouss, Alain Pirotte, Jean michel Renders,
                                                                      and Marco Saerens. Random-walk computation of
    Country     Baseline   Content-based    HybridRank                similarities between nodes of a graph, with application
   Indonesia     6.3%          7.1%           13.3%                   to collaborative recommendation. IEEE Transactions
      Lift         0          +0.127          +1.11                   on Knowledge and Data Engineering, 19:2007, 2006.
                                                                  [7] Byeong Man Kim, Qing Li, Chang Seok Park,
Table 4: Evaluation of algorithms on live production environ-         Si Gwan Kim, and Kim Ju Yeon. A new approach for
ment                                                                  combining content-based and collaborative filters. J.
                                                                      Intell. Inf. Syst., 27(1):79–91, 2006.
                                                                  [8] R. Motwani L. Page, S. Brin and T. Winograd. The
From the results HybridRank was shown to serve as a bet-              pagerank citation ranking: Bringing order to the web.
ter algorithm in recommending games in a user cold-start              Technical report, Stanford University, 1998.
scenario. The hybrid approach of using both user and fea-         [9] Dimitris Plexousakis Manos Papagelis and
ture correlation proved superior to the typical content-based         Themistoklis Kutsuras. Alleviating the sparsity
approach.                                                             problem of collaborative filtering using trust
                                                                      inferences. iTrust’05 Proceedings of the Third
                                                                      international conference on Trust Management, pages
6. CONCLUSION                                                         224–239, 2005.
This paper presents HybridRank, a personalised pagerank
                                                                 [10] Augusto Pucci Marco Gori. Itemrank: A random-walk
approach that incorporates both content metadata and user
                                                                      based scoring algorithm for recommender engines.
collaborative features in a novel approach. The algorithm
                                                                      International Joint Conferences on Artificial
was compared against state of the art collaborative filter-
                                                                      Intelligence, pages 2766–2771, 2007.
ing algorithms as well as content based approaches in live
                                                                 [11] A. H. Maslow. A theory of human motivation.
environment, with the conclusion that the hybrid approach
                                                                      Psychological Review, 50:370–396, 1943.
performs better against the algorithms that were compared
against. Also the algorithm proved to be able to help alle-      [12] Le Quang Thang Nguyen Duy Phuong and Tu Minh
viate the cold start problem. Future work will include the            Phuong. A graph-based method for combining
incorporation of context such as user location, global trends         collaborative and content-based filtering. PRICAI ’08
in mobile gaming as well as custom curated metadata to the            Proceedings of the 10th Pacific Rim International
approach.                                                             Conference on Artificial Intelligence: Trends in
                                                                      Artificial Intelligence, pages 859 – 869, 2008.
                                                                 [13] Incheon Paik and Hiroshi Mizugai. Recommendation
7. ACKNOWLEDGEMENTS                                                   system using weighted tf-idf and naive bayes classifiers
The authors would like to thank the WePlay team for the               on rss contents. JACIII, 14:631–637, 2010.
data and allowing us to conduct evaluation of our algorithms     [14] Rapeepisarn K. Paireekreng, W. and K.W. Wong.
on the live app.                                                      Personalised mobile game recommendation system.
                                                                      6th International Game Design and Technology
8. REFERENCES                                                         Workshop and Conference, 2008.
 [1] Andrew I. Schein et al. Methods and metrics for             [15] F. et al. Ricci. Recommender systems handbook,
     cold-start recommendations. Proceedings of the 25th              chapter Content-based Recommender Systems: State
     annual international ACM SIGIR conference on                     of the Art and Trends. Springer, Berlin, 2011.
     Research and development in information retrieval           [16] Bell R.M. and Koren Y. Improved neighborhood based
     (SIGIR ’02), pages 253–260, 2002.                                collaborative filtering. KDD 2007 Netflix Competition
 [2] Noam Koenigstein et al. The xbox recommender                     Workshop, 2007.
     system. Proceedings of the sixth ACM conference on          [17] Christopher D Manning Sepandar D Kamvar, Taher
     Recommender systems (RecSys ’12), pages 281–284,                 H Haveliwala and Gene H Golub. Extrapolation
     2012.                                                            methods for accelerating pagerank computations.
 [3] Pavle Skocir et al. The mars - a multi-agent                     WWW ’03 Proceedings of the 12th international
     recommendation system for games on mobile phones.                conference on World Wide Web, pages 261–270, 2003.
     KES-AMSTA’12 Proceedings of the 6th KES
     international conference on Agent and Multi-Agent


                                                            12

</pre>