Clustering Enhancement for a Token-Based Recommender Ville Ollikainen VTT Technical Research Centre of Finland 02150 Espoo, Finland ville.ollikainen@vtt.fi values, “tokens”, in user-item transactions. This enables distributed recommendations in a multi-player Abstract environment, based on bilateral communications between a user and a service. As such, while also providing easy This paper introduces a clustering enhancement scalability, the approach relates to multi-domain to an established token-based collaborative collaborative filtering proposed by [Zha12] and cross- recommendation method (“upcv”). The method domain recommendations proposed by [Gao13], creates privacy-protecting abstractions for users overcoming sparsity problems that are often experienced and items by exchanging and collecting in single domain collaborative recommenders. Privacy randomly generated N-bit values, “tokens”, in properties of the method have been presented in [Oll16]. user-item transactions. The novel enhancement The upcv recommender has been in public use in considers users’ random value spaces as Helsinki Metropolitan area libraries since 2014. hyperspaces in which the tokens are N- Available online, it has currently 600,000 patrons and dimensionally clustered. Instead of selecting actively covers 300,000 book titles. exchanged tokens at random, as in the baseline Some approaches, such as [Hua15], acknowledge that upcv, tokens are now selected from a cluster, users can hold multiple interests and items may belong to which has the best match with item’s token multiple categories. The same topic from context point of collection. Recommendation quality is evaluated view is addressed in [Sap16], suggesting that it is also with the same 3.5% density data set as in a important to incorporate the contextual information into previous publication. The quantitative analysis the recommendation process. [Bin12] in turn introduced indicates overall improvement in multi-class co-clustering for the purpose. Indeed, human recommendation quality while learning time life is versatile, and recommenders should respect this: decreased without exception, up to one-third. Although one person may be interested in cooking and There was improvement even when the number motorcycles and another interested in cooking and of exchanged tokes was exactly one, instead of gardening, the recommender should not end up over 100 in the baseline upcv. The performance associating motorcycles with gardening. improvement may be explained by the clustering The rest of this paper is structured as follows: Section enhancement inherently recognizing versatility 2 provides preliminaries about the token-based of each individuals’ interests. The paper also recommender. In Section 3, we introduce unsupervised presents a study with news data set, where the N-dimensional clustering of tokens. Results are presented improvement was in coverage. in Section 4 and concluded with discussions in Section 5. 1 Introduction 2 Preliminaries This paper is based on a collaborative token-based 2.1 A Token-Based Recommender (“UPCV”) recommendation method[Oll13, Oll17], which creates The method associates both users and items with privacy-protecting abstractions for users and items by collections of tokens, each token carrying a random value. exchanging and collecting randomly generated N-bit Interaction between user and item triggers selected ——————————————— tokens to be copied from the token collection of the user Copyright © CIKM 2018 for the individual papers by the papers' to the token collection of the item, and vice versa. In authors. Copyright © CIKM 2018 for the volume as a collection previous papers[Oll13, Oll16], the maximum size of by its editors. This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0). collections has been 1024 tokens, from which up to 15% In step b), Alice is accessing Item X, triggering a token have been randomly selected for the exchange. exchange procedure; copying tokens from the user to the When the same user interacts with several items, or the item and vice versa. In this step, both Alice and Item X same item is involved in interactions with several users, have only one token; these are the exchanged tokens tokens are spread around, resulting in statistical resulting step c), in which both Alice and Item X do have similarities among different token collections in the common tokens. (In the figure bold & italics font system. Since tokens are copied in user-item interactions highlights the most recently acquired tokens.) only, it is likely that similarities between two token Next, Bob is accessing the same item X in the Service collections originate from similar user behavior. The 1, and once again, a couple of tokens are copied over (step method is collaborative by nature and requires no content d). It should be noted that this time Item X is able to analysis. provide more than one token. After Bob’s action, it Collections are dynamic by nature and, unlike cookies should be noted that also Bob and Alice have similar and other tracking means, tokens have no persistent tokens, as can be seen in step e). associations with real world. In particular, they do not Still in e): Bob is accessing Item Y in the Service 2. A have any association with persons: tokens are mere couple of tokens are requested for exchange; since Bob random values that, over time, will be copied to and has more than one token to give, some tokens are picked deleted from collections. randomly from his collection. The following example[Oll17] in Figure 1 explains As the last action in this example, Alice requests a how tokens are able to provide recommendations. recommendation for herself from the Service 2, which she is now visiting the very first time (step f). Finally, Service a) 2 goes through all available items and compares their tokens with the provided tokens. It is likely that Item Y will be in the recommendation list, since there is a token b) in common. For a similarity measure in general, previous papers [Oll13, Oll16, Oll17] suggest using Jaccard index. 2.2 Book Club Data Set and a Previous S tudy c) The recommendation method was first introduced in [Oll13], using a data set collected from book club members of Bonnier Books Finland in 2013. This d) publication presents some details of the data set. Concisely, book club members (users) were asked, without limitation, which books (items) they had read and liked from a collection of 1041 books. 1575 members e) responded, of which 1532 selecting at least one book, providing 55434 individual selections total. Hence, density was 3.5%. 18 users selected exactly one book. The selections were converted into user-item pairs f) (“transactions”), shuffled into random order and divided into two subsets, each consisting of 27717 user-item pairs in random order. The first subset was used as training data, while the second subset remained for validation. In Figure 1: Example of Two Users and Two Services; the baseline study[Oll13], exchanged tokens were Recommendation for Alice at Her First Visit in Service 2 selected randomly while a maximum of 15% of token collections were exchanged in each transaction. When a The step a) in Figure 1 illustrates two independent token collection reached its maximum size (1024 tokens), services on the top, the upper service (“Service 1”) tokens were deleted at random to make space. having just a single item (“Item X”) and the lower service The quantitative assessment was focusing on a (“Service 2”) having a plurality of items, including “Item practical question: how long should a recommendation Y”. Two users, “Alice” and “Bob”, are presented for list be, in order to have at least one successful simplicity. In the beginning, each user and item has a recommendation, i.e. a book that a user has selected single random number, “token”, in their collections. exists in the validation subset. Tokens are illustrated as 823, 463, 71 and 343. dimension. A generic illustration of a hypercube is presented in Figure 3: An N-dimensional hypercube can be created by adding a copy of an N-1 dimensional hypercube with edges connecting respective corners. In the illustration, the corners are numbered in such a way that each dimension adds one most significant bit, in the original hypercube the bit value being 0 and in the copy 1. Figure 2: Median Length of a Successful Recommendation List (vertical axis, with 10%- 90% intervals) for Users With Specific Number of Books in Training Data (horizontal axis) [Oll13] Figure 3: N-dimensional Hypercube Is a Doubled Copy of Figure 2 (above) from the baseline study illustrates that an N-1-Dimensional Hypercube with an Edge Between shorter recommendation lists were adequate for users the Original (grey) and the Copied (black) Corner who had a higher number of books selected in the training In the novel concept, each token represents one corner data [Oll13]. For example, half of the users having ~28 in the hypercube, while a token collection is a set of books in the training data got at least one successful corners, respectively. recommendation in a list with five books. The figure also Tokens in each individual collection (separately) were illustrates intervals that exclude the upper and lower 10% then clustered with bisecting K-means, recursively of the users in each group, and a best matching trend line. splitting a hypercube into clusters and sub clusters each 2.3 News Portal Data S et Studies time by applying a cutting hyperplane through a previous cluster, until a pre-defined maximum number of clusters “Ilkka” news portal dataset consists of 2123 users, 2439 was reached or the remaining sub clusters were dense news articles and 35891 news clicks in chronological enough. order over a period of 30 days. Success was evaluated by For each cluster, we pick a best representing token, a checking if a user eventually clicked a recommended “center token”. When tokens are exchanged, the “item” news article. In recommendations, each article had a 24- discloses its center token and the “user” activates the hour active lifespan from its first click. Training period cluster, which is closest to it. Only the tokens in the active was 7 days individually, beginning at each user’s first cluster take part in token exchange. If the collection click. Recommendations were created and evaluated at reached its maximum size, those tokens that are farthest each click, excluding articles already clicked by the user. away from any cluster were deleted first. With these parameters, recommendation lists with at least If an already existing token was received, a new token one success were possible in 14552 clicks (hereinafter: was created as a randomly selected neighboring corner 100% ‘Click Coverage’). Maximum token collection size (out of 24), making the particular cluster more dense. was 256; otherwise, the setup was the same as in Chapter A 24-dimensional hypercube has 224 ≈ 16 million 2.2, with 15% token exchange (i.e. max. 38). corners, being capable to accommodate a relatively high number of clusters reflecting a multitude of our interests. 3 Methodology 3.2 New Similarity Metric 3.1 Clustering of Token C ollections Instead of Jaccard [Oll13], a new similarity metric (1) is The motivation behind clustering is, that a recommender introduced in this paper to take into account cluster should not mix multiple interests, such as “cooking”, densities, while not requiring exact matches as in Jaccard. “motorcycles” and “gardening”, as mentioned in the It sums up, how well each token in collection A matches Introduction. the other collection B: As a novelty, we introduce now a conceptual view of tokens as points in a hyperspace by converting token’s numeric value values into N-bit representation (e.g. N=24), resulting an N-dimensional binary hypercube. where A and B are token collections and dHAD is Each token represents one corner in this hypercube, each Hamming distance (i.e. number of differentiating bits). bit value (0 or 1) defining its projection in the respective The new similarity metric improves coverage, since it does not require exact matches between token Exchanging 3 tokens provided fastest learning: For collections. As a drawback, it is computationally heavier. instance, half of the users having ~8 books in training data got a successful recommendation in a list of mere five 3.3 Experiment books; the previous study indicated similar performance The experiment was carried out as in [Oll13], with the for users with ~25 books in training. same data and other parameters when applicable, in order Table 1 presents respective recommendation coverage to compare the effectiveness of clustering. The maximum and number of successful recommendations (out of 1523 number of clusters for any single user was set to four, after users) in the book club experiment. It is notable that some preliminary experiments. Items had a single cluster recommendations were most diverse when only 1 token (i.e. no clustering). The number of exchanged tokens was was exchanged: 31% of all books were in top-5 reduced to 1, 3 and 10 tokens, however with an additional recommendations at least once. The number of condition that the number of received tokens was never unsuccessful recommendations were ~10 (i.e. 99.3% allowed to exceed the number of already existing tokens. success): failures typically related to users with one rare Tokens that were closest to cluster centers were selected book in training data. Only two exceptions were for exchange, instead of random picks used in the baseline observed: users with two and three such classic books (at method. 1 and 3 token exchange, respectively) in training data that In the book experiment, the quantitative assessment could be classified as bestsellers. was similar as in [Oll13]. Recommendations were given only to those users that were in the training data, Table 1: Recommendation Coverage and Number of containing 1523 users; some users having only one book Successful Recommendations in the Book Club selected had that book not in training but in validation Experiment data. A second experiment was carried out with the news portal data set, comparing results of non-clustered and clustered methods with recommending random (with Monte Carlo evaluation) and most popular articles. The clustering method with 3 token exchange was compared As can be expected, the average number of tokens in to the baseline method as described in Chapter 2.3. each users’ collections varied according to the number of exchanged tokens: e.g., when exchanging exactly 1 token 4 Results at a time, a collection also grows by one at most. While the collections had different number of tokens, the As Figure 4 illustrates, clustering reduced learning time number of clusters did not substantially vary. (# of books in training data) to up to one-third, while the Figure 5 below illustrates proportions of users that overall quality was invariably improved. As found in would get successful recommendations with given length [Oll13], the method seems to be not critical of the number of a recommendation list, when exchanging 3 tokens. of tokens exchanged: there was notable improvement even when exchanging only one token at a time. Figure 5: Length of a Recommendation List When Exchanging 3 Tokens in Each Transaction; the Figure 4: Median Length of a Successful Proportion of Users Getting Successful Recommendation List With Clustering Token Recommendations at Given Length of a Collections, Exchanging Max 1, 3 and 10 Tokens in Each Recommendation List Transaction; all Outperform the Baseline Finally, Table 2 resents quantitative results of the news References recommendation experiment. The baseline method outperformed ‘most popular’ recommendations, but had [Bin12] Bin Xu, Jiajun Bu, Chun Chen, and Deng Cai. a reduced click coverage. Compared to the baseline, the 2012. An exploration of improving collaborative clustering method fell behind in recommendation quality recommender systems via user-item subgroups. but in turn provided full click coverage (c.f. 2.3). 3 tokens In Proceedings of the 21st International were exchanged, versus max. 38 in the baseline. Conference on World Wide Web. ACM, 21–30. Table 2: Results of the News Recommendation [Gao13] Gao, S., Luo, H., Chen, D., Li, S., Gallinari, P., Experiment & Guo, J. (2013). Cross-domain recommendation via cluster-level latent factor model. Lecture Notes in Computer Science, 8189 LNAI(PART 2), 161–176. doi:10.1007/978-3-642-40991-2_11 5 Conclusions and Discussion [Hua15] Huang, S., Ma, J., Cheng, P., Wang, S. (2015) A Hybrid Multigroup Coclustering This paper was focusing on comparing the results with a Recommendation Framework Based on previous baseline study[Oll13], with the same data set Information Fusion. ACM Trans. Intell. Syst. and evaluation metrics. As a novelty, unsupervised N- Technol. 6, 2, Article 27 (March 2015), 22 dimensional (N=24) clustering was applied to each users’ pages. DOI=http://dx.doi.org/10.1145/2700465 token collection individually, and only one cluster at a time was selected for token exchange. Each user had a [Oll13] Ollikainen, V., Mensonen, A., Tavakolifard, M. maximum on four clusters. (2013) UPCV Distributed recommendation In book club study this enhancement improved system based on token exchange. Journal of recommendation quality, while it at the same time Print and Media Technology Research, Vol. 2, reduced learning time. A previous publication[Oll13], No. 3, pp. 195 – 201 available at www.vtt.fi/inf/ reported that the method seems stable while varying the julkaisut/muut/2013/OA_JPMTR_1314_Ollikai percentage of exchanged tokens. Clustering makes no nen.pdf difference: The results remained better in all cases, when [Oll16] Ollikainen, V., Niemi, V. (2016) Privacy the number of exchanged tokens was reduced from Analysis of a Networked Collaborative maximum of over 100 (15% out of 1024) to 1, 3 and 10. Recommendation System. International Journal Recommendations were diverse: About one-fourth of of Humanities and Management Sciences the entire 1041 book collection were presented in Top-5 (IJHMS) Volume 4, Issue 4 (2016) ISSN 2320– recommendations. The evaluation suggested the best 4044 (Online). results with 3 token exchange. It is in further studies to optimize token exchange with different data sets. [Oll17] Ollikainen, V., Niemi, V. (2017) Evaluating the Tokens provide privacy-protecting abstraction. In Performance and Privacy of a Token-Based certain arrangements, a single token exchange will enable Collaborative Recommender. In Proceedings of privacy, since it solves the remaining returning-user WI ’17, Leipzig, Germany, August 23-26, 2017, privacy issue introduced in [Oll16]. 5 pages. In the news recommendation experiment, quality http://dx.doi.org/10.1145/3106426.3109434 improvement was less clear. Although the [Sap16] Sappelli, M., Verberne, S., & Kraaij, W. (2016). recommendation quality was close to popularity even Evaluation of context-aware recommendation with full coverage, a tradeoff between coverage and systems for information re-finding. Journal of the recommendation quality may exist: also worst cases get Association for Information Science and recommendations with full click coverage, adversely Technology, 68(4), 895-910. affecting quality evaluation. Further studies could doi:10.1002/asi.23717 quantify this phenomenon. While the N-dimensional clusters can be assumed to [Zha12] Zhang, Y., Cao, B., & Yeung, D.-Y. (2012). reflect multiple interests of users, the true relation Multi-domain collaborative filtering. arXiv between clusters and multiple interests, perhaps utilizing Preprint arXiv:1203.3535. newspaper sections, would be worth studying, together with optimizing the number of clusters.