An Adaptive Technique for Weighting Multiple Factors in Followee Recommendation Algorithms

Antonela Tommasel and Daniela Godoy
ISISTAN Research Institute, CONICET-UNCPBA, Tandil, Buenos Aires, Argentina

Abstract

The accurate suggestion of interesting friends arises as a crucial issue in recommendation systems. This work argues that the criteria for recommending friends (or followees) need to be adapted and combined according to each user's preferences. A technique is proposed for adapting such criteria to the characteristics of previously selected followees. Experimental evaluation showed that the technique improved the precision of static weighting strategies. Results highlighted the importance of adapting to changes in user preferences over time.

1 Introduction

Online social networks have an important place in the life of millions of users who actively use them for finding new friends. The decision to start following other users simultaneously attends to several reasons, which might differ for each individual user. Thus, understanding how users select followees emerges as a key design factor of strategies to personalise recommendations. Interestingly, most followee selection approaches are only based on equally important and independent factors, disregarding how users' interests can affect the followee selection. This work argues that followee recommendation criteria need to be personalised according to users' preferences. A technique is proposed for adapting such criteria to each user considering the characteristics of previously selected followees.

2 Related Work

Several approaches have been proposed to suggest interesting users in social networks based on a unique and independent factor [Golder and Yardi, 2010; Hannon et al., 2010]. Approaches that combine several factors assume that they are equally important to each user by averaging or multiplying them [Armentano et al., 2011]. Closely related to this work, Agarwal and Bharadwaj [2013], and Garcia and Amatriain [2010] personalised factors' weights. However, unlike the technique proposed by this work, changes over time in user preferences were not considered for adapting the weights.

3 An Adaptive Technique for Personalising Followee Recommendation

The technique suggests a list of interesting followees by optimally combining different recommendation factors. The combination is particular to each user as it is based on his/her preferences reflected on previously selected followees.

Computing Factor Weights. The overall similarity between users $u$ and $v$ ($Similarity(u,v)$) can be defined as a linear combination of the similarity for each followee recommendation factor ($sim_i(u,v)$) and its corresponding weight ($\alpha_i$) as follows:

$$Similarity(u,v) = \sum_{i=1}^{n} \alpha_i \cdot sim_i(u,v)$$

As recommendation systems aim to find the most similar potential followees, factors' weights ($\alpha_i$) should accurately capture user preferences. Thus, they are defined by considering the characteristics of the previously selected followees. A followee is assumed to have been chosen because of a given factor if their similarity with the target user for that factor is higher than a pre-defined threshold. The preference of users regarding each factor is computed as the percentage of followees for whom the similarity is higher than the threshold. Then, percentages are used as the similarity weights that will be further updated according to user preferences.

Updating Factor Weights. The computed weights are used for assessing the similarity between each potential followee and the target user in the recommendation process. The target user is presented with the set of most similar potential followees. For each accepted followee, i.e. each potential followee the target user has accepted or manifested interest in, weights are updated to reflect the new interests of the target user.

Ranking Recommended Followees. In standard similarity-based algorithms, as all recommended candidates are similar to the target user, they are likely to be similar to each other. Thus, such algorithms will never uncover certain items, which although less similar to the target user, are also important [Hurley and Zhang, 2011]. Consequently, it would be desirable to include novel or diverse items in the recommended list. Novelty could be introduced to similarity-based algorithms aiming at balancing both the relevance of candidate followees (i.e. their similarity to the target user) and the diversity of recommendations. Novelty can be measured in terms of the degree to which a candidate is unusual with respect to the target user's normal interests (i.e. the previously selected followees). It can be computed as

$$\frac{\sum_{i \in followees(u)} \left| Similarity(u,i) - Similarity(u,pf) \right|}{\left| followees(u) \right|}$$

where $u$ represents the target user, $pf$ represents the potential followee, $followees(u)$ represents the previously selected users of $u$ and $Similarity$ is the overall similarity. If previously selected followees are similar to the target user, and the new potential followee is dissimilar to the target user, he/she will also be dissimilar to previously selected followees. The higher the absolute differences, the higher the dissimilarity, and thus the novelty introduced. Consequently, the novelty of a potential followee can be assessed without computing the actual dissimilarity between the potential followee and each previously selected followee.

Finally, the potential followees are ranked by considering the linear combination of relevance and novelty. The weight of the novelty is computed as the percentage of the previously selected followees for whom the novelty score was higher than a pre-defined threshold. Similarly, the weight of the relevance is computed as the percentage of the previously selected followees for whom novelty was lower than the threshold. Both weights are updated as previously described.

4 Experimental Evaluation

This section presents the experimental evaluation performed to assess the effectiveness of the proposed technique.

Factors for Followee Recommendation. Although the presented technique could be applied to any arbitrary number of recommending factors, this work focuses on the two main followee recommendation factors: topology and content.

Topology. Most link prediction algorithms are based on network topology. The number of common followees is one of the most common metrics applied to the Twitter network. It can be defined as

$$\frac{\left| \Gamma_{out}(x) \cap \Gamma_{out}(y) \right|}{\left| \Gamma_{out}(x) \cup \Gamma_{out}(y) \right|}$$

where $x$ and $y$ are nodes (i.e. users), $k_x$ is the degree of node $x$, and $\Gamma(x)$, $\Gamma_{out}(x)$ and $\Gamma_{in}(x)$ are the set of neighbours, followees and followers of $x$, respectively.

Content. The interest of a user can be characterised by profiles based not only on the information they create and publish (publishing profile), but also on the information they consume (reading profile), for example the retweets. The publishing profile of user $u_j$ is built by considering all of the user's tweets ($tweets(u_j)$), which can be defined as: $\text{pub-profile}(u_j) = tweets(u_j)$. The reading profile of a user $u_j$ can be defined as: $\text{read-profile}_{RT}(u_j) = tweets_{RT}(u_k)\ \forall k \in followees(u_j)$. The similarity between the reading profile of a user and the publishing profile of their potential followees is assessed using the cosine similarity.

Experimental Settings. To evaluate the performance of the proposed technique, potential followees were ranked and the top-N users were selected. For each user, their actual followees and an equal proportion of randomly selected non-followees were added to the pool of potential followees to be recommended. To simulate the actual behaviour of target users over time, actual followees were added to the pool of potential followees in the same order in which the user started following them.

The proposed technique (adaptive) was compared against three static baselines: pure-topology, pure-content and half-topology-content. Additionally, adaptive was compared to a version that ignores the novelty factor: adaptive-no-novelty. The quality of recommendations was evaluated by selecting a ranked sub-set of the potential followees and computing the overall precision immediately after the weights were updated. As there is no explicit feedback from target users available, the evaluation assumes that items that were not originally part of the followee set are uninteresting for the user. This assumption might not be completely accurate as recommended users might not be selected simply because the user was unaware of them. As a result, precision might be underestimated.

The pool of potential followees comprised 20 users, out of which 10 were recommended to the user. Factors' weights were updated after 10 accepted recommendations. Initially, the technique assumes that no followee has been selected. Thus, all factors are assigned equal weights. The minimum similarity threshold was set to 0.7 for the content-based factor, and to 0.2 for the topology factor. The novelty threshold was set to 0.05.

Dataset. The dataset was obtained by crawling a set of 3,453 target users listing the account language as English, and having at least 10 followees and 10 published tweets. All user information was retrieved by means of the Twitter API (https://api.twitter.com).

Experimental Results. Figure 1 shows the evolution of the average recommendation precision for the first 70 weight updates performed. As regards the baselines, the best results were achieved when considering the pure-content alternative, which achieved a precision higher than 0.95, with differences of up to 58% regarding the worst baseline (pure-topology). These results indicated that although the majority of the followee relations were content driven, there were also followee relations that were not found with a pure content oriented strategy. Topology-based results further highlighted the fact that the majority of the followee relations are content driven.

Figure 1: Comparison of Precision Results

Regarding the proposed technique, the adaptive-no-novelty alternative achieved the worst results. Thus, although the combination of weights is adapted to each user, adaptation alone is not sufficient for further improving results. Also, it can be inferred that although users have a particular preference for a certain type of followees, they also select followees that do not exactly match such preferences. Consequently, the search and ranking of users should not be guided only by similarity, but also by novelty. Adding novelty (adaptive) improved on the best baseline. As the figure shows, the adaptive alternative was able to achieve an optimal precision after 26 weight updates. These results evidenced the importance of recommending both similar and novel followees. Finally, the figure also shows that precision remains stable once the preferences of users are learned and adapted.

Regarding the differences between the weights predicted by the technique and the real preferences of the target users, the absolute differences were below 0.1 for 76% of target users, highlighting the usefulness of the proposed technique not only for adequately capturing users' interests, but also for adapting to the changes in user preferences over time.

In summary, precision of recommendations can be improved when considering an adaptive technique for defining the weights of the recommendation factors. Results emphasised the importance of adapting the relevance or weights of the factors to changes in user preferences over time, and also of considering diversity in followee recommendations.

5 Conclusions

This work proposed a technique for adapting the followee selection criteria to the decisions of each particular user regarding the characteristics of his/her previously selected followees. Experimental evaluation showed that the proposed technique helped to improve precision results regarding static weighting strategies. Furthermore, results highlighted the importance of adapting to the changes of the user preferences over time.

References

[Agarwal and Bharadwaj, 2013] V. Agarwal and K. K. Bharadwaj. A collaborative filtering framework for friends recommendation in social networks based on interaction intensity and adaptive user similarity. Social Netw. Analys. Mining, 3(3):359–379, 2013.

[Armentano et al., 2011] M. Armentano, D. Godoy, and A. Amandi. A topology-based approach for followees recommendation in Twitter. In Proceedings of the ITWP at 22nd IJCAI, pages 22–29, 2011.

[Garcia and Amatriain, 2010] R. Garcia and X. Amatriain. Weighted content based methods for recommending connections in online social networks. In Proceedings of the 2nd RSWeb, pages 68–71, Barcelona, Spain, 2010.

[Golder and Yardi, 2010] S. A. Golder and S. Yardi. Structural predictors of tie formation in twitter: Transitivity and mutuality. In Ahmed K. Elmagarmid and Divyakant Agrawal, editors, SocialCom/PASSAT, pages 88–95. IEEE Computer Society, 2010.

[Hannon et al., 2010] J. Hannon, M. Bennett, and B. Smyth. Recommending Twitter users to follow using content and collaborative filtering approaches. In Proceedings of the 4th ACM Conference RecSys, pages 199–206, 2010.

[Hurley and Zhang, 2011] N. Hurley and M. Zhang. Novelty and diversity in top-n recommendation – analysis and evaluation. ACM Trans. Internet Technol., 10(4):14:1–14:30, March 2011.
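
As a rough illustration of the weighting, novelty and ranking steps described in Section 3, the following Python sketch shows one way they might fit together. It is not the authors' implementation. The function names, the dictionary-based input layout, the equal initial weights of 1/n, the zero novelty returned for an empty followee set, the equal relevance/novelty split before any followee is selected, and the use of novelty scores recorded for each previously selected followee are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def factor_weights(followee_sims: Sequence[Dict[str, float]],
                   thresholds: Dict[str, float]) -> Dict[str, float]:
    """Weight of a factor = share of previously selected followees whose
    similarity for that factor exceeds the factor's threshold."""
    factors = list(thresholds)
    if not followee_sims:
        # No followee selected yet: equal weights (1/n is an assumption).
        return {f: 1.0 / len(factors) for f in factors}
    total = len(followee_sims)
    return {f: sum(1 for s in followee_sims if s[f] > thresholds[f]) / total
            for f in factors}


def overall_similarity(sims: Dict[str, float],
                       weights: Dict[str, float]) -> float:
    """Linear combination of per-factor similarities and their weights."""
    return sum(weights[f] * sims[f] for f in weights)


def novelty(candidate_similarity: float,
            followee_similarities: Sequence[float]) -> float:
    """Mean absolute difference between the candidate's overall similarity
    and the overall similarity of each previously selected followee."""
    if not followee_similarities:
        return 0.0  # assumption: no history means no measurable novelty
    diffs = (abs(s - candidate_similarity) for s in followee_similarities)
    return sum(diffs) / len(followee_similarities)


def ranking_weights(past_novelty: Sequence[float],
                    novelty_threshold: float) -> Tuple[float, float]:
    """Return (relevance weight, novelty weight) from the novelty scores
    recorded for previously selected followees (hypothetical input)."""
    if not past_novelty:
        return 0.5, 0.5  # assumption: equal split before any selection
    w_nov = sum(1 for n in past_novelty if n > novelty_threshold) / len(past_novelty)
    # Scores equal to the threshold are counted towards relevance (assumption).
    return 1.0 - w_nov, w_nov


def rank_candidates(candidates: Dict[str, Dict[str, float]],
                    followee_sims: Sequence[Dict[str, float]],
                    past_novelty: Sequence[float],
                    thresholds: Dict[str, float],
                    novelty_threshold: float,
                    top_n: int) -> List[str]:
    """Rank candidates by a linear combination of relevance and novelty."""
    weights = factor_weights(followee_sims, thresholds)
    followee_overall = [overall_similarity(s, weights) for s in followee_sims]
    w_rel, w_nov = ranking_weights(past_novelty, novelty_threshold)
    scores = {}
    for name, sims in candidates.items():
        relevance = overall_similarity(sims, weights)
        scores[name] = w_rel * relevance + w_nov * novelty(relevance, followee_overall)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

With the thresholds reported in Section 4 (0.7 for content, 0.2 for topology, 0.05 for novelty), the weights would be recomputed by calling `factor_weights` and `ranking_weights` again after each batch of accepted recommendations. How the per-factor similarities themselves are obtained (cosine similarity over profiles, common followees) is left outside this sketch.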