=Paper=
{{Paper
|id=Vol-1245/paper2
|storemode=property
|title=HybridRank: A Hybrid Content-Based Approach To Mobile Game Recommendations
|pdfUrl=https://ceur-ws.org/Vol-1245/cbrecsys2014-paper02.pdf
|volume=Vol-1245
|dblpUrl=https://dblp.org/rec/conf/recsys/ChowFM14
}}
==HybridRank: A Hybrid Content-Based Approach To Mobile Game Recommendations==
HybridRank : A Hybrid Content-Based Approach To Mobile
Game Recommendations
Anthony Chow Min-Hui Nicole, Foo, Giuseppe Manai, Ph.D.
Group Digital Life Ph.D. Group Digital Life
Singapore Group Digital Life Singapore
Telecommunications Ltd Singapore Telecommunications Ltd
awjchow@singtel.com Telecommunications Ltd giuseppe@singtel.com
nicolefoo@singtel.com
ABSTRACT corporation of both content-based and user-based signals to
The massive number of mobile games available necessitates generate recommendations. Online evaluations were con-
a technique to help the consumer find the right game at the ducted and results yield that the approach presented per-
right time. This paper introduces HybridRank, a novel hy- formed best when compared against user-based collabora-
brid algorithm to deliver recommendations for mobile games. tive filtering[16] and content-based filtering approaches.[15]
This technique is based on a personalised random walk ap-
proach, with the incorporation of both content-based and 2. RELATED WORK
user-based information in the formulation of the recommen-
Recommendation algorithms can be broadly classified into
dations. This technique is evaluated against traditional neigh-
two well known techniques: collaborative filtering methods
bourhood based collaborative filtering and content-based rec-
and content-based methods. User-user collaborative filtering
ommendation algorithms. This paper also explores the fact
starts by placing the user in a vector space of their explicit
that this algorithm can also be used to help alleviate the
and implicit activities. A nearest neighbour algorithm with
cold start problem that is associated with little user data.[1]
a defined distance metric is then applied to identify what
Online evaluations were conducted and results yield that the
items the users might like based on behaviours of users that
approach presented performed the best in both a controlled
are most similar to them. Such an algorithm typically faces
testing environment as well as in live production. This algo-
issues of data sparsity [9] and algorithm complexity[5].
rithm is currently implemented in a live mobile game plat-
form developed by Singapore Telecommunications Ltd called
Unlike collaborative filtering, content based recommenda-
WePlay.
tion systems takes the approach of item-to-item correlation.
With this technique, the system learns to recommend items
Categories and Subject Descriptors that are similar to the ones that the user liked in the past.
H.4 [Recommender Systems]: personalised pagerank,random The similarity is calculated based on features associated
walk,online evaluation,cold-start,hybrid recommender with the items being compared.
General Terms There is also an increasing focus on the ability to combine
Experimentation, Algorithms, Measurement both user and content metadata together in a hybrid way to
generate recommendations such as using Naive Bayes [1] or
clustering techniques [7]. Graph based approaches of using
1. INTRODUCTION concept graphs [12] and Markov Chains [6] have also been
Singapore Telecommunications Ltd (SingTel) launched We- presented. Such approaches usually model the users as a
Play, a mobile game app store in early 2014. With an in- bipartite graph where the nodes are users and items, and
creasing number of mobile games available to the consumer, a link is drawn between the nodes if a user has done an
there is a need for the development of a mobile game rec- activity on the item, i.e. watched a movie, liked a restaurant
ommendation system. Key work on this domain has been or listened to a song.
done by Xbox [2] and others [3]. This paper presents a novel
approach, HybridRank, which is a hybrid content-based rec- This paper explores new approaches towards the graph-based
ommender system using a biased random walk model. This hybrid recommender system problem. It draws heavily on
is an adaptation of the ItemRank algorithm [10] for the in- the Pagerank algorithm [8]. This algorithm has been adapted
by ItemRank [10], as well as LBSNRank[4].The Pagerank al-
gorithm computes an importance score for each node, and
one can use this important score as a measure of importance
within the network to be used to provide recommendations.
Copyright 2014 for the individual papers by the paper’s authors. 3. PROBLEM DESCRIPTION
Copying permitted for private and academic purposes. This volume is A recommender system deals with a set of users ui where
published and
CBRecSys copyrighted
2014, October 6, by its editors.
2014, Silicon Valley, CA, USA. i = 1, ..., n and a set of items pj where j = 1, ..., m. For each
CBRecSys 2014
Copyright 2014,byOctober 6, 2014, Silicon Valley, CA, USA.
the author(s) user, item pair, (ui , pj ) the system generates a score that
10
will describe the relationship between ui and pj captured matrix, the normalised matrices are not symmetric. The
in a relationship score rij . To generate this score, this pa- diagonals are also 0 by definition. These two correlation
per proposes HybridRank, an algorithm that combines both matrices become valuable graphic model to indicate correla-
items features and user behaviours in a novel way for recom- tions between games. The weights associated with the links
mendations. The key steps in setting up the algorithm is to provide approximate measures of games relation.
generate two matrices based on user and feature correlation
respectively.
4. HYBRIDRANK: THE ALGORITHM
The idea underlying HybridRank is that a hybrid combi-
3.1 User and Feature Correlation Matrices nation of both the user and feature correlation graph can
The user correlation graph Gu draws the correlation between be used to forecast the user preferences in a content-based
games via user co-occurrence, i.e, a link appears between 2 approach. The personalised page rank algorithm has been
games gi and gj if one or more users have downloaded both shown to be a good algorithm to be used for such a use case
games. It is noted that multiple user co-occurrence matrices as in [10]. This algorithm o↵ers key properties of propa-
can be developed, where links between users can be drawn gation and attenuation. Utilising the relationships between
not only when they have downloaded both games, but also games, captured by both the feature and user correlation
the frequency of which they have viewed and played the matrices, Ũ and F̃, the personalised page rank algorithm
games on the platform. For these co-occurrence matrices, is able to propagate preferences through the graph from a
define matrix M 2 Rn⇤m where n is number of games and m given starting point. As the preferences move further away
is number of users. Mxij represents the number of times a from the seed nodes, the influence of the user preferences
user ui has conducted an action x on the game gj . Examples diminishes, and such an attenuation property is aptly cap-
of actions would be viewed, downloaded or played. Their tured by the said algorithm. The personalised page rank
respective correlation matrix can then be generated via the algorithm is defined as below:
inner product of the matrices as follows:
P Rui = ↵ · M · P Rui + (1 ↵) · dui (3)
Ux = Mx · M|x (1)
P Rui refers to the personalised page rank vector for a partic-
The feature correlation graph Gf on the other hand draws
ular user ui , which gives an indication of the importance the
the correlation between games via feature co-occurrence, i.e.
di↵erent nodes in the system to the user ui . M refers to the
a link appears between two games gi and gj if both games
stochastic matrix which captures the connectivity between
share one or more metatags. For example, two games that
all the nodes in the system. This paper uses the feature cor-
share the same developer, or price points will share a link.
relation matrix F̃ to represent the connections between the
The feature set Fij is defined as the set of features which be-
games. The vector dui is often referred to as the teleport
long to both gi and gj where i 6= j and i, j 2 Savailablegames .
vector, which allows the introduction of bias into the system
These features can typically be generated from two sources.
to a given user ui . This generates a static score distribution
The first source will be structured information provided by
vector of all the items that user has consumed or has an
the developer, the second being user generated content like
opinion for. For example, the j th element of the vector dui
reviews. This paper focuses on utilising structured metatags
will be 1 if the user ui has downloaded the game, and 0
provided by the former source.
otherwise. The vector will then be normalised to sum to 1.
It is noted there are some features that are more impor-
The HybridRank algorithm builds on this idea by introduc-
tant in determining game similarity as compared to others,
ing the user correlation matrix U to build this vector dui .
for example two games sharing the same game mechanics is
Let the set Dudli , Duv i and Dup i be the set of games that the
considered more similar than two games sharing the same
user ui has downloaded, viewed and played respectively. The
price point. [14]. As such, a set of weights k associated
vector dui can be defined as follows:
with each feature can be defined. This set of weights can
be learned, defined via empirical experiments or assigned
via a TF-IDF on the content vector of the items[13]. Specif- X X X
ically for the experiments conducted a hierarchy of metadata djui = · ˜ +⌘·
Udl Uv˜jk + ✓ · Up˜jk (4)
jk
was defined, and weights were assigned from qualitative user dl
k2Du v
k2Du p
k2Dui
i i
feedback. With that, the feature correlation matrix can be
defined as follows:
Next normalise the vector to sum to one, to obtain d˜ui . For
this paper, equal weights have been assigned to the weights
n
X , ⌘ and ✓ and further optimisation is underway. In the
Fij = k 1k (2) simple case, where the user has only downloaded one game,
k=1 the vector d˜ui will simply be the j th column of the matrix
U corresponding to the game that the user has downloaded.
1k will be 1 if both games gi and gj share feature k, and 0 This draws upon not only the user preferences, but also as-
otherwise. We normalise both matrices, U and F column- signs a bias towards games that are close in relationship to
wise to generate stochastic matrices Ũ and F̃, such that each the games selected via a simple collaborative approach or
column sums up to 1. While the former is a symmetrical captured via the user co-occurrence matrix.
10
Linear algebra approaches via power iteration can be used Type Definitions No of
to solve equation 3. There has been research to improve testers
computation efficiency, one of them being in [17]. Also, in Trend Testers are at the forefront and ac- 34
terms of complexity, [10] has shown that such a computation Seeker and tively contribute online reviews to
is efficient from both computation and memory resources. Contributor share knowledge
(Grp1)
5. EXPERIMENT RESULTS Trend Testers are at the forefront and 77
The HybridRank algorithm was developed and deployed in Seeker less actively/seldom contribute
two separate live experiments in relation to mobile game (Grp2) online reviews to share knowledge
recommendations. The first experiment was done in a con- Contributor Testers who are not at the fore- 41
trolled fashion with a preloaded web prototype with a group (Grp3) front but are actively involved in
of 526 users. The second experiment was done on the pro- information exchanges with like-
duction app WePlay with over 100,000 users in Indonesia. minded gamers through online re-
views
Social Testers who take heed from what 128
5.1 Online Evaluation 1 (Grp4) their friends/social circle play
The seed dataset has 78099 users and 199 games. Each game Indi↵erent Testers who just want to stay in 246
also comes with its set of 149 set of metadata, including tags (Grp5) the game and are indi↵erent to
that are temporal in nature, i.e. whether it is trending, top what others say
grossing or curated by marketing team for that week. The
following shows the distribution of the tags available. As Table 2: Distribution and definitions of user testing groups
from equation 2, weights were assigned to several top-level
categories. This was purely done via qualitative assumptions
and reasoning.
Table 3 shows the performance results of the four algorithms
across the five di↵erent segments. Success of the algorithms
Tag Type Tag Count
were measured by comparing the lift in average number of
Developers 79 0.2 games selected against baseline. It can be seen that Hy-
Categories 25 0.3 bridRank provided maximal lift in segments of users who
Price Ranges 11 0.05 are indi↵erent and social gamers. For the small segment of
ESBN Ratings 6 0.1 users who are trend seekers, it appears that the top grossing
In-App Goods 2 0.05 algorithm performed the best. This could be because the
Others (Motivations, Goals etc) 26 0.3 HybridRank did not take into consideration global market
features in the development of the item metadata.
Table 1: Distribution of metatags available and weights assigned
Grp1 HyRank CF TopG Baseline
Average 2.12 1.56 1.88 1.15
A testing portal was developed and sent to 526 digitally Lift +0.846 +0.359 +0.641 0
savvy members of SingTel Digital Advisor Panel1 . These
Grp2 HyRank CF TopG Baseline
users were asked to select up to five mobile games that they
like, and four separate lists of recommendations were pro- Average 1.756 1.536 1.878 0.927
vided generated by HybridRank, kNN Collaborative Filter- Lift +0.895 +0.658 +1.03 0
ing, Top Grossing and Baseline. The baseline algorithm Grp3 HyRank CF TopG Baseline
randomly selects games from the entire catalog. Each set Average 1.922 1.481 1.805 0.974
of recommendations exposed seven games. The users were Lift +0.973 +0.52 +0.853 0
asked to then choose the mobile games they like across all
Grp4 HyRank CF TopG Baseline
the lists provided.
Average 2.023 1.585 1.781 0.914
To evaluate the results, users were segmented2 across the di- Lift +1.214 +0.735 +0.9487 0
mensions of externalised gratifications and internalised fulfil- Grp5 HyRank CF TopG Baseline
ment. The former comprises of factors associated with basic Average 1.671 1.276 1.459 0.825
progression in the game, scoring, beating the competition. Lift +1.02 +0.546 +0.768 0
The latter involves the altruistic sharing of knowledge and
experience, helping others in game progression and gaining Table 3: Results of recommendations lift across the di↵erent
respect and trusted recognition. Table 2 gives the definition groups
and distribution of users across the segments.
1
This is a panel of 15,000 users across South East Asia main- 5.2 Online Evaluation 2
tained by SingTel Group Digital Life to help in testing of new In this second evaluation, the algorithm was exposed to over
digital products. The users in this study have been screened 100,000 users in Indonesia in the live WePlay app with over
to have played at least a mobile game in the past one month.
2
This framework is an ongoing research by the team in 900 games to recommend from. The section evaluated rec-
Group Digital Life SingTel in an e↵ort to deeper understand ommends games that are similar to the selected game. This
the gamer’s psyche, fundamentally based on Maslow0 s Hier- particular use-case can be likened to the cold start prob-
archy of Needs [11] lem, where there are no previous preferences of the user and
11
the only preference being the current selected game. The Systems: technologies and applications, pages 104–113,
HybridRank algorithm was compared with two other algo- 2012.
rithms. The first being a commonly used content-based algo- [4] Zhaoyan Jin et al. Lbsnrank: personalized pagerank
rithm via an euclidean distance metric on the feature vector on location-based social networks. Proceedings of the
of the games as in [15] and the second being a baseline that 2012 ACM Conference on Ubiquitous Computing,
chooses the more popular items within the same category pages 980–987, 2012.
as the selected game. The experiment was conducted live [5] Diego Fernandez Fidel Cacheda, Victor Carneiro and
on the platform in Indonesia for a period of one month in Vreixo Formoso. Comparison of collaborative filtering
an out of time validation fashion. The entire base was ex- algorithms: Limitations of current techniques and
posed to the algorithms in an alternating day fashion. The proposals for scalable, high-performance recommender
click through rates of the suggested game were measured systems. ACM Transactions on the Web (TWEB),
- the higher the click through rate, the more e↵ective the 5(1):2:1–2:33, 2011.
algorithm was considered to be. [6] Francois Fouss, Alain Pirotte, Jean michel Renders,
and Marco Saerens. Random-walk computation of
Country Baseline Content-based HybridRank similarities between nodes of a graph, with application
Indonesia 6.3% 7.1% 13.3% to collaborative recommendation. IEEE Transactions
Lift 0 +0.127 +1.11 on Knowledge and Data Engineering, 19:2007, 2006.
[7] Byeong Man Kim, Qing Li, Chang Seok Park,
Table 4: Evaluation of algorithms on live production environ- Si Gwan Kim, and Kim Ju Yeon. A new approach for
ment combining content-based and collaborative filters. J.
Intell. Inf. Syst., 27(1):79–91, 2006.
[8] R. Motwani L. Page, S. Brin and T. Winograd. The
From the results HybridRank was shown to serve as a bet- pagerank citation ranking: Bringing order to the web.
ter algorithm in recommending games in a user cold-start Technical report, Stanford University, 1998.
scenario. The hybrid approach of using both user and fea- [9] Dimitris Plexousakis Manos Papagelis and
ture correlation proved superior to the typical content-based Themistoklis Kutsuras. Alleviating the sparsity
approach. problem of collaborative filtering using trust
inferences. iTrust’05 Proceedings of the Third
international conference on Trust Management, pages
6. CONCLUSION 224–239, 2005.
This paper presents HybridRank, a personalised pagerank
[10] Augusto Pucci Marco Gori. Itemrank: A random-walk
approach that incorporates both content metadata and user
based scoring algorithm for recommender engines.
collaborative features in a novel approach. The algorithm
International Joint Conferences on Artificial
was compared against state of the art collaborative filter-
Intelligence, pages 2766–2771, 2007.
ing algorithms as well as content based approaches in live
[11] A. H. Maslow. A theory of human motivation.
environment, with the conclusion that the hybrid approach
Psychological Review, 50:370–396, 1943.
performs better against the algorithms that were compared
against. Also the algorithm proved to be able to help alle- [12] Le Quang Thang Nguyen Duy Phuong and Tu Minh
viate the cold start problem. Future work will include the Phuong. A graph-based method for combining
incorporation of context such as user location, global trends collaborative and content-based filtering. PRICAI ’08
in mobile gaming as well as custom curated metadata to the Proceedings of the 10th Pacific Rim International
approach. Conference on Artificial Intelligence: Trends in
Artificial Intelligence, pages 859 – 869, 2008.
[13] Incheon Paik and Hiroshi Mizugai. Recommendation
7. ACKNOWLEDGEMENTS system using weighted tf-idf and naive bayes classifiers
The authors would like to thank the WePlay team for the on rss contents. JACIII, 14:631–637, 2010.
data and allowing us to conduct evaluation of our algorithms [14] Rapeepisarn K. Paireekreng, W. and K.W. Wong.
on the live app. Personalised mobile game recommendation system.
6th International Game Design and Technology
8. REFERENCES Workshop and Conference, 2008.
[1] Andrew I. Schein et al. Methods and metrics for [15] F. et al. Ricci. Recommender systems handbook,
cold-start recommendations. Proceedings of the 25th chapter Content-based Recommender Systems: State
annual international ACM SIGIR conference on of the Art and Trends. Springer, Berlin, 2011.
Research and development in information retrieval [16] Bell R.M. and Koren Y. Improved neighborhood based
(SIGIR ’02), pages 253–260, 2002. collaborative filtering. KDD 2007 Netflix Competition
[2] Noam Koenigstein et al. The xbox recommender Workshop, 2007.
system. Proceedings of the sixth ACM conference on [17] Christopher D Manning Sepandar D Kamvar, Taher
Recommender systems (RecSys ’12), pages 281–284, H Haveliwala and Gene H Golub. Extrapolation
2012. methods for accelerating pagerank computations.
[3] Pavle Skocir et al. The mars - a multi-agent WWW ’03 Proceedings of the 12th international
recommendation system for games on mobile phones. conference on World Wide Web, pages 261–270, 2003.
KES-AMSTA’12 Proceedings of the 6th KES
international conference on Agent and Multi-Agent
12