A Half-Life Decaying Model for Recommender Systems with Matrix Factorization

Panagiotis Ardagelou and Avi Arampatzis

Database & Information Retrieval research unit, Department of Electrical & Computer Engineering, Democritus University of Thrace, Xanthi 67100, Greece.
panosard93@gmail.com, avi@ee.duth.gr

Abstract. We propose the use of an exponential decay function for modeling drifts in user interests in collaborative filtering systems. For that purpose, we introduce the notion of half-life of ratings and incorporate it as a bias into a matrix factorization model. Experimental results on movie ratings spanning a period of approximately 7 months show that employing a half-life of around 150 days yields large improvements in prediction accuracy, confirming that significant user interest shifts exist over time and that the proposed model offers a viable strategy.

1 Introduction

The evolution of recommender (or recommendation) systems over the past decade has greatly influenced our lives, even though we are often unaware of their existence. Much of the online information that reaches us derives from various recommendation techniques. From the news we read and the products we buy, to the movies we watch and the music we listen to, most of our online input is far from random. Both users and services benefit from high-quality recommendations: they increase customer satisfaction, and they allow companies to offer better quality of service (QoS), which leads to higher earnings. For these reasons, well-performing recommendation techniques are vital for achieving and maintaining a high degree of interaction between companies (services) and customers [4].

In order to assess users’ preferences, recommender systems need some kind of feedback from users, either implicit or explicit.
Explicit feedback refers to ratings given by users to specific items, according to the rating system each service uses; a ‘like’ or a 5-star rating are examples. Implicit feedback is based on user actions that reveal an interest in a specific item. Such actions include browsing a certain product page, watching a video, etc., and they can often prove very useful for increasing the accuracy of recommendations [16].

While there is a variety of ways to produce recommendations, two techniques dominate. The first is content-based filtering, which builds a user’s profile from known attributes of the items she has liked or bought. Content-based filtering can provide straightforward recommendations that are generally easier to understand and more self-explanatory than those created by other techniques, because it generates user profiles in isolation, using only the user’s history. The second technique is called collaborative, or memory-based, filtering. It differs from content-based techniques in that it does not build a user’s profile in isolation but also uses other users’ preferences; furthermore, it does not use any item attributes.

Collaborative filtering methods can generally be classified into two sub-categories: neighborhood models and latent factor models. In this paper we deal only with latent factor models. In such models, the utility matrix is usually factorized using matrix factorization techniques, which are described more thoroughly later in this paper.

We chose to work with matrix factorization mainly because it makes it easy to embed external information in the modeling process. In addition, it generally provides very good prediction accuracy and relatively fast running times.
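As a rough illustration of the latent factor idea mentioned above, the following minimal sketch fits two thin factor matrices to the known entries of a toy utility matrix. It is not the model of this paper: the stochastic gradient updates, the L2 regularization, the hyperparameter values, the toy ratings, and the names `factorize`, `predict`, `rmse`, `P` and `Q` are all assumptions made for illustration.

```python
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of user u's and item i's factor vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    """Root-mean-square error over the known (user, item, rating) triples."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=500, seed=0):
    """Fit P (n_users x k) and Q (n_items x k) to the known ratings only."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization (an assumed choice).
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Toy utility matrix as (user, item, rating) triples; blanks are simply absent.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
           (2, 1, 2), (2, 2, 5), (3, 0, 1), (3, 2, 4)]
P, Q = factorize(ratings, n_users=4, n_items=3)
print(round(rmse(ratings, P, Q), 3))   # training error on the known ratings
print(round(predict(P, Q, 0, 2), 2))   # prediction for a blank entry
```

Because only the known ratings drive the updates, the product of the two factor matrices also yields values for the blank entries, which is what makes the factorization usable for prediction.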
A milestone in the evolution of recommender systems was the well-known Netflix Prize Challenge, held between 2006 and 2009, which revolutionized our knowledge of recommender systems [3, 5]. Since then, significant research has taken place and many new, versatile recommendation approaches have emerged. One of the main methods the winners of the Netflix Prize Challenge employed in their recommendation techniques was the embedding of temporal information, through some temporal modeling of users’ preferences [9].

In this paper, we are concerned with temporal modeling schemes. We consider explicit user feedback and time-dependent recommendation techniques using a matrix factorization model, which takes full advantage of dimensionality reduction methods for very large and sparse matrices. We propose a novel half-life decaying model embedded into a matrix factorization process, which exploits the temporal information available in the dataset we use for our experiments. In this way, we generate a blend of recommendations based on the more recent preferences of each user, while not completely ignoring her past preferences, which still carry significant information.

2 Related Work

Several past studies have shown that incorporating new information into modeling can boost prediction accuracy. For example, some studies use external information from social networks to create social regularization models, which capture the social relationships between users and provide recommendations based on each user’s social neighborhood [13, 18]. While the social aspect is one important class of information that can boost the prediction accuracy of a recommender system, another one that has received increasing attention in recent years is time.
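To make the role of time concrete, recall the notion of a rating's half-life proposed in this paper: the influence of a rating halves every fixed number of days. The sketch below is a generic exponential half-life decay, assuming the age of a rating is measured in days. The function name `half_life_weight` is hypothetical, and how such a weight actually enters the proposed model is specified later in the paper, not here. The 150-day value is the half-life reported in the abstract.

```python
def half_life_weight(age_days, half_life_days):
    """Weight in (0, 1] that halves every `half_life_days` days of age."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    # Equivalent to exp(-ln(2) * age_days / half_life_days).
    return 0.5 ** (age_days / half_life_days)

# Weights of ratings of increasing age under a 150-day half-life.
for age in (0, 75, 150, 300):
    print(age, round(half_life_weight(age, 150), 3))
```

Under this weighting an old rating is never discarded outright; its weight only shrinks, which matches the stated goal of favoring recent preferences without ignoring past ones.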
Exploiting the temporal information available in the data has proven to be of vital importance in the field of so-called time-aware recommender systems (TARS), as it improves recommendation performance. One of the first temporal approaches to increase recommendation accuracy treats the collaborative filtering task as a time-series problem, binning the data into different temporal bins [19]. Another one, based on implicit purchase data, takes into consideration the launch time of an item and the time it was purchased, in order to create recommendations based on the more recent purchases, which better reflect the current preferences of a user [11, 12]. Other studies, such as that of the winning team of Netflix’s competition [9], use external information to capture temporal effects within the data and to model more precisely both users’ preferences over time and items’ popularity.

In this paper, within the general framework of collaborative filtering systems with temporal information, we introduce a new half-life decaying model by directly employing a weighting function. The idea comes from [1], where the authors introduced such a function into training user profiles with Rocchio in a content-based filtering context. In experiments with the OHSUMED corpus, a collection of medical abstracts spanning a period of five years, they found that effectiveness peaks when using a half-life of around 4 years on average in training. Nevertheless, further analysis revealed a wide spread of optimal half-life values across topics. In our work, we adapt this idea to a collaborative filtering setup.

3 Matrix Factorization

A widely used technique in many data mining and machine learning problems is dimensionality reduction.
Broadly speaking, when too many features are available to describe a system’s parameters, these features may form subsets, which eliminates the need to refer to each one separately. By doing so, the computational cost decreases significantly and the whole process becomes more efficient, as we move from a high-dimensional space to a space of fewer dimensions.

In order to perform dimensionality reduction, we have to factorize the utility matrix. There are several ways to do so, with singular value decomposition and UV decomposition being two of the most common. We perform UV decomposition, which means that we try to create two new thinner matrices of dimensions m × k and n × k, so that the product of the first and the transpose of the second approximates the initial utility matrix. In our case, we do not want the product to reproduce the initial utility matrix exactly: an exact reproduction would also recreate its blank entries, leaving no variation with which to fill them and thus no predictions.

Let us denote an entry in the initial utility matrix, i.e. a known rating, as r_{u,i}, and the predicted value for this rating as r̂_{u,i}. Furthermore, each user u is described by a vector p_u ∈