
Context-Regularized Neural Collaborative Filtering for Game App Recommendation

Shonosuke Harada∗

sh1108@ml.ist.i.kyoto-u.ac.jp 1

Makoto Yamada

myamada@i.kyoto-u.ac.jp 2

Kazuki Taniguchi

taniguchi_kazuki@cyberagent.co.jp 0

Hisashi Kashima

kashima@i.kyoto-u.ac.jp 2

0 CyberAgent, Inc., Tokyo, Japan
1 Kyoto University, Kyoto, Japan
2 Kyoto University/RIKEN AIP, Kyoto, Japan

2019

People spend a substantial amount of time playing games on their smartphones. Owing to growth in the number of newly released games, it is getting more difficult for people to identify which of the broad selection of games they want to play. In this paper, we introduce context-aware recommendation for game apps that combines neural collaborative filtering and item embedding. We find that some contexts specific to games are effective in representing item embeddings in implicit feedback situations. Experimental results show that our proposed method outperforms conventional methods.

INTRODUCTION

Recommendation systems are popular and have been successfully developed, particularly in e-commerce services including YouTube [4], Netflix [1], and Amazon [11], to name a few.

Recommendation systems basically focus on predicting each user’s preference for each kind of item. Collaborative filtering [ 13 ] is a widely used personalized recommendation method that recommends a new item using past user–item interactions. A typical collaborative filtering algorithm would be based on matrix completion [ 8 ], which decomposes a user–item matrix into user latent features and item latent features. For a long time, matrix completion algorithms based on factorization algorithms have been the first choice in recommender systems [ 8 ].

Recently, deep learning approaches [15–17] have attracted considerable attention in the recommender systems community. However, the collaborative denoising autoencoder (CDAE) [17] does not improve its performance even when non-linear activation functions and deeper models are used. One reason is that CDAE is equivalent to SVD++ when the identity function is used as the activation function, and it applies a linear kernel to model user–item interactions [5].

To handle this issue, neural collaborative filtering (NCF) [5] has been proposed, which was the first successful deep-learning-based collaborative filtering algorithm. NCF employs a simple neural network architecture consisting of only multi-layer perceptrons and a generalized matrix factorization (GMF). Thanks to its simplicity, it can train deep learning models without overfitting and, surprisingly, outperforms state-of-the-art collaborative filtering using only user–item information.

Another successful collaborative filtering algorithm is based on word embedding [10]. More specifically, pointwise mutual information (PMI) [3], which is computed from the item–user matrix, is used as a regularizer in addition to the matrix completion loss function. Thanks to PMI regularization, similar items are embedded at nearby locations in a latent space; this helps significantly to train deep learning models efficiently. This approach is promising. However, to the best of our knowledge, no deep-learning-based approaches of this kind have been put forward.

In this paper, we propose the context-regularized neural collaborative filtering (CNCF), which enjoys the representation power of deep learning and can be efficiently trained thanks to PMI-based regularization. Specifically, we naturally combine NCF and PMI regularization [10, 14], in which item latent vectors are shared in both NCF and PMI-based embedding. Thanks to its simplicity, CNCF can be efficiently trained using a standard deep learning package. Through experiments on real-world game app recommendation tasks, we show that the proposed method significantly outperforms the vanilla NCF, which is a state-of-the-art recommender algorithm.

PROBLEM FORMULATION

Let $Y \in \mathbb{R}^{M \times N}$ be the user–item (game app) matrix whose elements are $y_{ij} = 1$ if user $i$ installed game app $j$ and $0$ otherwise, where $M$ and $N$ are the number of users and the number of items (game apps), respectively. This is a standard implicit feedback setting. If $y_{ij} = 1$, it means that user $i$ installed [...] timestamp, the last login timestamp, and the paid flag, respectively. To use this information for recommendations, we generate a time-dependent matrix $T \in \mathbb{R}^{M \times N}$, where $\tilde{t}_{ij}$ is the difference [...]; $\tilde{t}_{ij}$ is later transformed into the normalized dwell time [18]. Moreover, for the paid flag information, we extract the paid matrix $P \in \mathbb{R}^{M \times N}$, where $r_{ij}$ is 1 if user $i$ pays money for item $j$ and 0 otherwise.

[Figure 1: Model architecture of CNCF. Recoverable labels: MLP user embedding, MLP layer, NeuMF layer, PMI terms $s_{jm} - \hat{v}_j^\top \hat{v}_m$ and $s_{jm} - \tilde{v}_j^\top \tilde{v}_m$, target $y_{ij}$, and output $\hat{y}_{ij}$.]

The final goal of this paper is to build a recommendation model for user-item matrix Y using the user-item matrices Y , T , and P .
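As an illustration of the three matrices, the following minimal sketch builds $Y$, a raw time-difference matrix, and $P$ from a small list of interaction records. The record layout `(user, item, install_ts, last_login_ts, paid)`, the function name `build_matrices`, and the choice of last login minus install time as the raw difference are assumptions made for this example; the subsequent normalization into dwell time [18] is not shown.

```python
def build_matrices(records, num_users, num_items):
    """Build Y (installs), T (raw time differences) and P (paid flags).

    `records` is a hypothetical list of tuples
    (user, item, install_ts, last_login_ts, paid); this log format is an
    illustrative assumption.
    """
    Y = [[0] * num_items for _ in range(num_users)]
    T = [[0.0] * num_items for _ in range(num_users)]
    P = [[0] * num_items for _ in range(num_users)]
    for user, item, install_ts, last_login_ts, paid in records:
        Y[user][item] = 1
        # Assumption: the raw time feature is last login minus install time.
        T[user][item] = float(last_login_ts - install_ts)
        P[user][item] = 1 if paid else 0
    return Y, T, P


# Toy log with two users and three game apps.
records = [
    (0, 1, 100, 400, True),
    (0, 2, 150, 150, False),
    (1, 0, 200, 260, False),
]
Y, T, P = build_matrices(records, num_users=2, num_items=3)
```

Unobserved user–item pairs keep the value 0 in all three matrices, which matches the implicit feedback setting in which missing entries are not confirmed negatives.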

PROPOSED METHOD (CONTEXTUAL NCF)

In this section, we propose the contextual neural collaborative filtering (CNCF), which is an extension of the widely used NCF algorithm [5].

Model: The following model with one perceptron layer is used:

$$\hat{y}_{ij} = \sigma\left(h^\top\left((\hat{u}_i \otimes \hat{v}_j) \oplus a\left(W(\tilde{u}_i \oplus \tilde{v}_j)\right)\right)\right),$$

where $\otimes$ and $\oplus$ denote the element-wise product and the concatenation of two embeddings, $h$ denotes the edge weights of the output layer, and $a$ is an activation function such as ReLU. $\hat{u}$ and $\tilde{u}$ denote the MF and MLP user embeddings, and $\hat{v}$ and $\tilde{v}$ denote the MF and MLP item embeddings, respectively. CNCF consists of generalized matrix factorization and multilayer perceptrons. Figure 1 shows the model architecture of CNCF. The GMF layer computes the element-wise product of two embeddings. In the PMI layer, we compute the inner products between the $j$-th item embedding and all item embeddings, for both the MF and the MLP embeddings.

Using context information: As contextual features, we use time-dependent features $T$ and a paid flag $P$ as

$$L_{\text{context}} = \begin{cases} -\sum_{(i,j) \in \mathcal{Y} \cup \mathcal{Y}^-} \left[(1 + \alpha t_{ij})\, y_{ij} \log \hat{y}_{ij} + (1 - y_{ij}) \log(1 - \hat{y}_{ij})\right] & \text{(time-NCF)} \\ -\sum_{(i,j) \in \mathcal{Y} \cup \mathcal{Y}^-} \left[(1 + \beta r_{ij})\, y_{ij} \log \hat{y}_{ij} + (1 - y_{ij}) \log(1 - \hat{y}_{ij})\right] & \text{(paid-NCF)} \end{cases}$$

where $\alpha \ge 0$ and $\beta \ge 0$ are tuning parameters, $\mathcal{Y}$ is the set of indices of non-zero elements in $Y$, and $\mathcal{Y}^-$ is the set of indices of zero elements in $Y$. In implicit feedback settings, two popular strategies for addressing the lack of negative data are treating all unobserved data as negative feedback [6] and negative sampling from unobserved data [5]. Note that we sampled user–item pairs as negative interactions from the unobserved interaction set.

Regularization based on pointwise mutual information (PMI): In this paper, in addition to contextual information, we introduce an embedding structure to NCF since it helps to improve prediction accuracy [10, 14]. In particular, we employ the GloVe-based embedding approach [12]. The loss function of GloVe can be written as

$$J = \sum_{j,m} \left(s_{jm} - v_j^\top v_m\right)^2,$$

where $s_{jm}$ is some similarity measure between item $j$ and item $m$. In this study, we employ positive pointwise mutual information (PPMI) [9] as a similarity measure:

$$\mathrm{PPMI}(x, y) = \max(\mathrm{PMI}(x, y), 0), \qquad \mathrm{PMI}(x, y) = \log_2 \frac{p(x, y)}{p(x)\,p(y)},$$

where $p(x)$ denotes the probability of users installing game app $x$ and $p(x, y)$ denotes the probability that users install both game apps $x$ and $y$. Finally, the loss function of the context-regularized NCF adds the PMI-based regularizer $J$, weighted by the regularization parameter $\lambda \ge 0$, to $L_{\text{context}}$.
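To make the building blocks of this section concrete, the following minimal sketch computes the one-layer prediction, one summand of the context-weighted loss, the PPMI similarity matrix, and the GloVe-style regularizer $J$ in plain Python. All function names are illustrative. The sketch assumes the standard binary cross-entropy form for the negative term, estimates $p(x)$ and $p(x, y)$ as empirical install frequencies over users, and sums $J$ over all item pairs including $j = m$; these are assumptions rather than details confirmed by the text.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(u_mf, v_mf, u_mlp, v_mlp, W, h):
    """y_hat = sigmoid(h^T [(u_mf * v_mf) ; ReLU(W [u_mlp ; v_mlp])])."""
    gmf = [a * b for a, b in zip(u_mf, v_mf)]  # element-wise product
    x = list(u_mlp) + list(v_mlp)               # concatenation
    mlp = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W]
    features = gmf + mlp
    return sigmoid(sum(hk * fk for hk, fk in zip(h, features)))


def weighted_bce(y, y_hat, context, coef, eps=1e-12):
    """One summand of L_context.

    `context` is t_ij (time-NCF) or r_ij (paid-NCF); `coef` is alpha or beta.
    Assumption: standard binary cross-entropy for the negative term.
    """
    pos = (1.0 + coef * context) * y * math.log(y_hat + eps)
    neg = (1.0 - y) * math.log(1.0 - y_hat + eps)
    return -(pos + neg)


def ppmi_matrix(Y):
    """PPMI between items, with probabilities estimated over users."""
    num_users, num_items = len(Y), len(Y[0])
    p = [sum(row[j] for row in Y) / num_users for j in range(num_items)]
    S = [[0.0] * num_items for _ in range(num_items)]
    for j in range(num_items):
        for m in range(num_items):
            p_jm = sum(row[j] * row[m] for row in Y) / num_users
            if p_jm > 0.0:  # pairs that never co-occur keep PPMI = 0
                S[j][m] = max(math.log2(p_jm / (p[j] * p[m])), 0.0)
    return S


def glove_regularizer(S, V):
    """J = sum over (j, m) of (s_jm - v_j^T v_m)^2."""
    total = 0.0
    for j, v_j in enumerate(V):
        for m, v_m in enumerate(V):
            dot = sum(a * b for a, b in zip(v_j, v_m))
            total += (S[j][m] - dot) ** 2
    return total


# Toy example: 4 users, 3 game apps, 2-dimensional item embeddings.
Y_toy = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
S_toy = ppmi_matrix(Y_toy)
V_toy = [[0.5, 0.1], [0.4, 0.2], [-0.3, 0.6]]
J_toy = glove_regularizer(S_toy, V_toy)
```

In a full implementation the same item embeddings would feed both `predict` and `glove_regularizer`, which is how the regularizer shapes the embeddings used for prediction.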

EXPERIMENTS

We gathered game app click information from a commercial game app company. Figure 2 shows some examples of game apps. We then used 100,000 users who had installed more than 20 game apps and played one of their games within the last two years. The number of games was 725 (i.e., $Y \in \mathbb{R}^{100000 \times 725}$), and the number of non-zero entries was 2,854,328.

We implemented all methods using PyTorch and ran experiments using a Tesla P100. We set the learning rate to 0.001 and the batch size to 1024, and used Adam [7] as the optimizer. For the regularization parameters of CNCF, we used α = 0.01, β = 0.1, and λ = 1. For all experiments, we set the number of multi-layer perceptron layers to four and the number of latent feature representations to 64. The model parameters were randomly initialized. We set the negative sampling ratio to 2, which means that for every user we sample two unobserved interactions as negative samples per observed interaction.
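The negative sampling step can be sketched as follows for a single user. The function name `sample_negatives`, the fixed seed, and sampling without replacement are assumptions made for this example; only the ratio of two negatives per observed interaction comes from the setup above.

```python
import random


def sample_negatives(y_row, ratio=2, rng=None):
    """Sample `ratio` unobserved items per observed item for one user.

    `y_row` is one row of the binary user-item matrix Y.
    Assumption: negatives are drawn without replacement, and the count is
    capped by the number of unobserved items.
    """
    rng = rng or random.Random(0)
    positives = [j for j, y in enumerate(y_row) if y == 1]
    unobserved = [j for j, y in enumerate(y_row) if y == 0]
    k = min(len(unobserved), ratio * len(positives))
    negatives = rng.sample(unobserved, k)
    return positives, negatives


# One user who installed items 0 and 4 out of 10 game apps.
row = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
positives, negatives = sample_negatives(row, ratio=2)
```

Training pairs for this user are then the positives with label 1 and the sampled negatives with label 0; resampling the negatives every epoch is a common choice but is not specified here.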

To evaluate the performance of the item recommendation, we used the leave-one-out scheme, which has been widely used in the relevant literature [ 5 ]. As evaluation metrics, we adopted HitRatio (HR) and normalized discounted cumulative gain (nDCG), which are also popular in recommendation tasks [ 5 ].
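A minimal sketch of the two metrics under the leave-one-out scheme is given below. For each user, one held-out item is ranked among a set of candidate items by predicted score. The data layout (`scores_by_user` as a dictionary of item scores per user), the function names, and the way the candidate set is built are assumptions made for this example.

```python
import math


def rank_items(scores):
    """Return candidate items sorted by descending predicted score."""
    return sorted(scores, key=scores.get, reverse=True)


def hit_ratio_at_k(ranked, held_out, k=10):
    """1 if the held-out item appears in the top-k list, else 0."""
    return 1.0 if held_out in ranked[:k] else 0.0


def ndcg_at_k(ranked, held_out, k=10):
    """With a single relevant item, nDCG reduces to 1 / log2(position + 2)."""
    for position, item in enumerate(ranked[:k]):
        if item == held_out:
            return 1.0 / math.log2(position + 2)
    return 0.0


def evaluate(scores_by_user, held_out_by_user, k=10):
    """Average HR@k and nDCG@k over users."""
    hrs, ndcgs = [], []
    for user, scores in scores_by_user.items():
        ranked = rank_items(scores)
        hrs.append(hit_ratio_at_k(ranked, held_out_by_user[user], k))
        ndcgs.append(ndcg_at_k(ranked, held_out_by_user[user], k))
    return sum(hrs) / len(hrs), sum(ndcgs) / len(ndcgs)


# Toy example: two users, three candidate items each.
scores_by_user = {
    "u1": {"a": 0.9, "b": 0.8, "c": 0.1},
    "u2": {"a": 0.2, "b": 0.7, "c": 0.9},
}
held_out_by_user = {"u1": "a", "u2": "b"}
hr10, ndcg10 = evaluate(scores_by_user, held_out_by_user, k=10)
hr1, ndcg1 = evaluate(scores_by_user, held_out_by_user, k=1)
```

HR@k only records whether the held-out item is retrieved, whereas nDCG@k also rewards placing it near the top of the list, so the two metrics can rank methods differently.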

Figures 3 to 6 show the results of the proposed and the existing methods. As can be seen, the proposed contextual NCF compares favorably with the existing state-of-the-art algorithms.

[Figures 3 to 6: HitRatio@10 and nDCG@10 versus the number of people (10,000 to 50,000). Methods shown: ItemPop, GMF, NCF, NCF+PMI, NCF+time, NCF+paid, and NCF+all.]

REFERENCES

[1] James Bennett, Stan Lanning, et al. 2007. The Netflix prize. In KDD Cup and Workshop.

[2] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, et al. 2016. Wide & deep learning for recommender systems. In the 1st Workshop on Deep Learning for Recommender Systems.

[3] Kenneth Ward Church and Patrick Hanks. 1990. Word association norms, mutual information, and lexicography. Computational Linguistics 16, 1 (1990), 22-29.

[4] James Davidson, Benjamin Liebald, Junning Liu, Palash Nandy, Taylor Van Vleet, Ullas Gargi, Sujoy Gupta, Yu He, Mike Lambert, Blake Livingston, et al. 2010. The YouTube video recommendation system. In RecSys.

[5] Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural collaborative filtering. In WWW.

[6] Yifan Hu, Yehuda Koren, and Chris Volinsky. 2008. Collaborative filtering for implicit feedback datasets. In ICDM.

[7] Diederik Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).

[8] Yehuda Koren. 2008. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In KDD.

[9] Omer Levy and Yoav Goldberg. 2014. Neural word embedding as implicit matrix factorization. In NIPS.

[10] Dawen Liang, Jaan Altosaar, Laurent Charlin, and David M Blei. 2016. Factorization meets the item embedding: Regularizing matrix factorization with item co-occurrence. In RecSys.

[11] Greg Linden, Brent Smith, and Jeremy York. 2003. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing 1 (2003), 76-80.

[12] Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global vectors for word representation. In EMNLP.

[13] Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2001. Item-based collaborative filtering recommendation algorithms. In WWW.

[14] Thanh Tran, Kyumin Lee, Yiming Liao, and Dongwon Lee. 2018. Regularizing Matrix Factorization with User and Item Embeddings for Recommendation. In CIKM.

[15] Aaron Van den Oord, Sander Dieleman, and Benjamin Schrauwen. 2013. Deep content-based music recommendation. In Advances in Neural Information Processing Systems. 2643-2651.

[16] Hao Wang, Naiyan Wang, and Dit-Yan Yeung. 2015. Collaborative deep learning for recommender systems. In KDD.

[17] Yao Wu, Christopher DuBois, Alice X Zheng, and Martin Ester. 2016. Collaborative denoising auto-encoders for top-n recommender systems. In WSDM.

[18] Xing Yi, Liangjie Hong, Erheng Zhong, Nathan Nan Liu, and Suju Rajan. 2014. Beyond clicks: dwell time for personalization. In RecSys.