<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Hybrid algorithms for recommending new items in personal TV</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Fabio</forename><surname>Airoldi</surname></persName>
							<email>fabio.airoldi@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution" key="instit1">Politecnico di Milano</orgName>
								<orgName type="institution" key="instit2">DEI P.zza Leonardo da Vinci</orgName>
								<address>
									<postCode>32</postCode>
									<settlement>Milano</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Paolo</forename><surname>Cremonesi</surname></persName>
							<email>paolo.cremonesi@polimi.it</email>
							<affiliation key="aff1">
								<orgName type="institution" key="instit1">Politecnico di Milano</orgName>
								<orgName type="institution" key="instit2">DEI P.zza Leonardo da Vinci</orgName>
								<address>
									<postCode>32</postCode>
									<settlement>Milano</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Roberto</forename><surname>Turrin</surname></persName>
							<email>roberto.turrin@moviri.com</email>
							<affiliation key="aff2">
								<orgName type="institution">Moviri -R&amp;D Via Schiaffino</orgName>
								<address>
									<postCode>11</postCode>
									<settlement>Milano</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Hybrid algorithms for recommending new items in personal TV</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">15DF2C514B7FA3EDB590609AE58634A3</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T01:23+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Recommending TV programs in the interactive TV domain is a difficult task since the catalog of available items is very dynamic, i.e., items are continuously added and removed.</p><p>Although recommender systems based on collaborative filtering typically outperform content-based systems in terms of recommendation quality, they suffer from the new-item problem, i.e., they are not able to recommend items that have few or no ratings.</p><p>Conversely, content-based recommender systems are able to recommend both old and new items, but the general quality of their recommendations in terms of relevance to the users is low.</p><p>In this article we present two different approaches for building hybrid collaborative+content recommender systems, whose purpose is to produce relevant recommendations while overcoming the new-item issue. The approaches are tested on the implicit ratings collected from 15,000 IPTV users over a period of six months.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>Interactive television allows providers to deliver a huge amount of digital content to their customers. Since discovering interesting items can be difficult in such wide collections, recommender systems are used to provide users with personalized lists of items that they may like.</p><p>Recommendations are based on the user preferences, referred to as ratings, that the system has gathered either explicitly (typically on a 1 to 5 scale) or implicitly (typically in binary format: 1 if the user likes an item, 0 otherwise). In the interactive TV (iTV) domain ratings are often implicit <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b12">13,</ref><ref type="bibr" target="#b11">12]</ref>, inferred by tracking the users' activity (e.g., the purchased movies or the watched TV programs).</p><p>Item catalogs in iTV settings are intrinsically subject to frequent modifications: new programs are usually added as soon as they are available and old content becomes unavailable. We can identify:</p><p>(i) A set of items that are repeated over time, such as television seasons, weekly talk shows, or reruns of movies. We can assume that a number of users have watched these items, thus providing implicit ratings.</p><p>(ii) A set of items that are shown for the first time, such as the first showing of a movie. We can assume that no ratings have been collected about these items.</p><p>Most recommender systems are based on collaborative filtering algorithms, i.e., they recommend items on the basis of the preferences of users similar to the target user. However, since no ratings have been collected for new programs, they cannot be recommended by collaborative recommender systems. This issue is known as the new-item problem.</p><p>As an alternative to collaborative filtering, recommender systems can implement content-based filtering algorithms. 
Since content-based approaches base their predictions upon the description of TV programs in terms of features (such as genre and actors), they are not influenced by the lack of ratings. Unfortunately, collaborative algorithms have been systematically shown to outperform content-based algorithms in terms of recommendation quality, measured by standard accuracy metrics (e.g., recall and precision) and error metrics (e.g., RMSE and MAE). As an example, if we consider the state-of-the-art recommender algorithms implemented in <ref type="bibr" target="#b1">[2]</ref>, in our experiments the collaborative approach reaches a recall of 19%, while the content-based approach does not exceed 3%.</p><p>In iTV settings, we would like to build a recommender system that is not affected by the new-item problem, but whose quality is comparable to that of collaborative filtering. Several hybrid algorithms have been proposed in the literature that merge content-based and collaborative filtering into a single algorithm. Some of them have even been shown to outperform base collaborative recommender systems in terms of quality. However, the proposed solutions are rarely used in production environments, mainly because of scalability issues. Furthermore, most approaches are designed to work with explicit, non-binary ratings, without any particular focus on the new-item problem.</p><p>In this work we present two families of hybrid collaborative+content algorithms which are specifically designed to respect the requirements of commercial real-time recommender systems. We take into account state-of-the-art collaborative and content-based recommender algorithms typically used in the presence of implicit, binary ratings. 
Each hybrid solution is composed of one collaborative and one content-based algorithm.</p><p>The main idea behind the first family of hybrid algorithms is to augment the existing ratings used for training the collaborative algorithm with additional ratings estimated with the content-based algorithm. In contrast, the second family of hybrid algorithms merges the item-to-item similarities computed by the collaborative and the content-based algorithms.</p><p>As a comparison, we also implemented a state-of-the-art hybrid algorithm and a trivial one. The latter simply mixes into a single recommendation list the items recommended by the collaborative algorithm and those recommended by the content-based algorithm.</p><p>The recommendation quality of the hybrid algorithms has been evaluated in terms of recall against the quality of baseline algorithms on the dataset implicitly collected by an IPTV provider over a period of six months. A specific testing methodology has been designed in order to evaluate the quality of recommender algorithms in the presence of new items.</p><p>In Section 2 we present an overview of existing hybrid recommender systems, with particular focus on the algorithms that are most promising in the dynamic iTV scenario. In Section 3 we describe the reference state-of-the-art algorithms we included in our experiments. In Section 4 we illustrate the new algorithms that we developed. Section 5 details the dataset we used for the evaluation and the testing methodology we adopted. Section 6 shows and discusses the results we obtained. Finally, Section 7 draws the conclusions and outlines possible future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">RELATED WORK</head><p>Recently, several television operators have considered integrating a recommender system into their architectures in order to provide personalized content (e.g., <ref type="bibr" target="#b1">[2]</ref>).</p><p>The need for recommender systems in TV applications is motivated by the fact that users generally appreciate receiving personal suggestions generated according to their dynamically updated profiles, as shown in <ref type="bibr" target="#b25">[26]</ref>. The same study also reported that customers prefer to be recommended items similar to the ones that they have already rated, as well as items that their friends have enjoyed.</p><p>Recommender systems used to deliver personalized TV experiences are typically content-based <ref type="bibr" target="#b35">[36]</ref>, collaborative <ref type="bibr" target="#b34">[35]</ref> or hybrid solutions <ref type="bibr" target="#b33">[34]</ref>.</p><p>In the next paragraphs we summarize the limitations of non-hybrid recommender systems and we present an overview of the most interesting existing algorithms suitable for personalized TV purposes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Drawbacks of standard algorithms</head><p>The two main families of recommender algorithms are content-based filtering and collaborative filtering.</p><p>The former is based on the analysis of the content of items (e.g., genre and actors). The latter, in contrast, suggests items on the basis of the preferences of users similar to the target user.</p><p>Collaborative algorithms have recently received much more attention than content-based solutions. The main reason is that they usually reach a higher recommendation quality than content-based systems <ref type="bibr" target="#b16">[17,</ref><ref type="bibr" target="#b7">8]</ref>. Furthermore, while collaborative algorithms can be easily implemented in any domain, content-based systems are much more complex because they require analyzing the items' content (e.g., parsing textual data) <ref type="bibr" target="#b22">[23]</ref>.</p><p>However, since collaborative algorithms are based on the user ratings, they have the following major drawbacks <ref type="bibr" target="#b0">[1]</ref>:</p><p>• New-item. Collaborative filtering is particularly affected by the new-item problem: it is unable to recommend items that have received few or no ratings because the system does not have enough information about them.</p><p>• Popularity bias. Collaborative filtering is biased towards the most popular items, i.e., items which have been rated by many users are more likely to be recommended than items that have few ratings.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Hybrid algorithms</head><p>In recent years several approaches have tried to overcome the drawbacks of single recommender approaches by combining them into new hybrid recommender algorithms, from simple implementations (e.g., <ref type="bibr" target="#b33">[34,</ref><ref type="bibr" target="#b24">25,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b5">6]</ref>) up to very complex algorithms, such as the BellKor solution that won the Netflix Prize <ref type="bibr" target="#b2">[3]</ref>, which combines predictions from 107 different baseline recommender systems <ref type="bibr" target="#b3">[4]</ref>. The idea of merging multiple predictors has often been used in machine learning to improve classification accuracy (e.g., <ref type="bibr" target="#b20">[21]</ref>).</p><p>Burke performed an extensive survey and classification of hybrid recommender systems <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b6">7]</ref>.</p><p>Mixed algorithms present simultaneously the items recommended by two different recommender algorithms. As an example, Smyth and Cotter <ref type="bibr" target="#b33">[34]</ref> show in the same interface recommendations generated by a content-based and a collaborative algorithm.</p><p>When the confidence of a prediction is measurable, it is possible to implement a switching algorithm, which selects the best algorithm to use according to some confidence estimates. For instance, the system 'NewsDude' <ref type="bibr" target="#b4">[5]</ref> combines a content-based nearest-neighbor recommender, a collaborative recommender, and a naïve Bayes classifier. Ratings are predicted by the algorithm which shows the highest confidence, and the user is then recommended the items that have the highest predictions.</p><p>Weighted algorithms compute a linear combination of the ratings predicted by two (or more) recommender algorithms. 
A similar approach has been proposed by Mobasher et al. <ref type="bibr" target="#b24">[25]</ref>, who linearly combine item-to-item similarities generated by different recommender algorithms. Unlike other hybrid solutions, this method can be used on implicit datasets (see Section 3.3 for further details).</p><p>Meta-level recommender systems use the model produced by an auxiliary algorithm for training the primary algorithm. As an example, the restaurant recommender proposed by Pazzani in <ref type="bibr" target="#b27">[28]</ref> builds a feature-based model of the users using content-based information. This model is then used by a collaborative algorithm to compute recommendations.</p><p>Melville et al. propose a feature-augmentation algorithm denoted "Content boosted collaborative filtering" (CBCF) <ref type="bibr" target="#b23">[24]</ref>, which Burke's survey <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b6">7]</ref> reports as one of the best algorithms. Their approach basically consists in creating 'augmented user profiles' by adding 'pseudo-ratings' to original user profiles prior to generating recommendations. Pseudo-ratings are generated using a content-based naïve Bayes classifier and can be interpreted as the ratings that users would give to unrated items, given the items' features. Rating prediction is computed with a variant of the user-based collaborative approach, where user-to-user similarities are computed as the Pearson correlation coefficient between the original user profiles and the augmented user profiles, and the weights assigned to pseudo-ratings depend on the number of rated and co-rated items for each user. Melville et al. reported that their algorithm performed better than the content-based naïve Bayes and the user-based collaborative algorithms in terms of MAE (Mean Absolute Error). 
They also showed that their algorithm is less susceptible to problems induced by sparsity: at 99.9% sparsity their algorithm performed exactly like the content-based baseline.</p><p>Although several hybrid algorithms have been brought to scientific attention during the last few years, there is no exhaustive literature on hybrid recommender algorithms designed to alleviate the new-item problem, with the only notable exception being the work of Schein et al. <ref type="bibr" target="#b32">[33,</ref><ref type="bibr" target="#b31">32]</ref>. They describe a generative algorithm with the specific purpose of achieving high performance in cold-start situations. In their work they present a multi-aspect probabilistic model which is used to compute the probability that an item is liked or not by a user. The aspects include collaborative data as well as content-based data and one or more latent aspects. The hidden aspects are used to model the hypothesis that, as an example, a user liked a specific movie because of some particular, latent motivation. Algorithms of this kind, along with association rule mining, neural networks and Boltzmann machines, are promising but difficult to integrate into existing systems, as they use a completely different approach from traditional collaborative filtering and do not scale properly with the number of users and items.</p><p>Hybrid algorithms are also subject to a series of problems. First, many of them (e.g., weighted hybrids) need to use underlying algorithms that are able to perform rating predictions. This makes them useless on implicit domains like iTV applications: algorithms which rely on predicting ratings strictly need explicit ratings to work properly, as they are usually optimized with respect to error metrics (such as MAE or RMSE; see, e.g., <ref type="bibr" target="#b21">[22]</ref>). 
With only implicit, binary ratings available (as in our datasets), this set of algorithms cannot be used, since we do not have proper ratings.</p><p>Additionally, some of the algorithms have scalability issues. As an example, CBCF <ref type="bibr" target="#b23">[24]</ref> requires augmented user profiles in order to produce recommendations. Since it is impossible to store the augmented user rating matrix (it would be a full matrix), the augmented profiles need to be generated in real time. However, this requires computing the pseudo-ratings for all the users in the neighborhood using a naïve Bayes classifier. Using the neighborhood size suggested by Melville et al., this would mean generating 30 recommendation lists using the Bayesian classifier just to produce one hybrid recommendation list.</p><p>Other simpler hybrid algorithms belonging to the weighted, switching, and mixed categories need to generate in real time a number of recommendation lists equal to the number of underlying baseline recommenders. As a consequence, even the simplest algorithm belonging to those categories (e.g., the interleaved approach described in Section 3.3) would have half the throughput of a non-hybrid solution.</p><p>The only state-of-the-art hybrid approach with a complexity comparable to non-hybrid algorithms is SimComb <ref type="bibr" target="#b24">[25]</ref>; however, we did not find any reference to this solution being applied to implicit datasets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">STATE-OF-THE-ART ALGORITHMS</head><p>Recommender algorithms usually collect user ratings in a user-by-item matrix, from here on referred to as User Rating Matrix (URM) and denoted by R, where r_ui is the rating given by user u to item i. The u-th row of this matrix is denoted by r_u and represents the profile of user u.</p><p>The URM is typically very sparse (users rate, on average, only a few items). In iTV settings we assume that only implicit feedback is available, so that the URM contains binary ratings, where the values 1 and 0 indicate, respectively, whether the TV program has been watched or not.</p><p>Collaborative filtering algorithms analyze the URM to identify similarity between either users (user-based) or items (item-based), or hidden relations between users and items (latent factors). Item-based collaborative algorithms have been shown to outperform user-based approaches in terms of scalability and recommendation quality <ref type="bibr" target="#b0">[1]</ref>. They discover item-to-item similarities using metrics such as the cosine or the adjusted cosine similarity <ref type="bibr" target="#b29">[30]</ref>. Rating prediction for a specific item is computed by considering the ratings given by the target user to items similar to it (denoted neighbors). Finally, latent factor collaborative algorithms are typically based on singular value decomposition (SVD) to extract underlying relations between users and items <ref type="bibr" target="#b30">[31,</ref><ref type="bibr" target="#b21">22]</ref>.</p><p>Content-based recommender systems base recommendations on the features extracted from the items' content, such as the actors, the genre, and the summary of a movie. The typical bag-of-words approach represents items as vectors of features stored in a feature-by-item matrix, which is referred to as Item Content Matrix (ICM) and denoted by W. 
Items' content needs to be processed in order to identify terms (tokenizing), to remove useless words (stop-word removal), to normalize words (stemming), and to weight each feature (weighting). As for the latter, weights are usually computed using the TF-IDF metric <ref type="bibr" target="#b28">[29]</ref>. Users can be represented as feature vectors, composed of the features of the movies they have watched. Rating prediction can be computed as the similarity between user and item feature vectors.</p><p>In the following sections we describe the state-of-the-art recommender algorithms we have taken into consideration. They comprise two collaborative filtering algorithms, one content-based algorithm, and two hybrid algorithms.</p><p>The algorithms we have considered are designed to be used with implicit, binary datasets. Furthermore, the algorithms share the property of not requiring any user-specific parameters to be learned in advance. As a consequence, these algorithms are able to produce recommendations for any user in real time on the basis of his/her most recent ratings.</p><p>The state-of-the-art collaborative and content-based algorithms will be used in Section 4 for defining two families of hybrid recommender algorithms, each one combining one content-based and one collaborative filtering algorithm: filtered feature augmentation and similarity injection.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Collaborative</head><p>In the following we present one item-based (NNCosNgbr) and one latent factor (PureSVD) state-of-the-art collaborative algorithms suitable for predicting top-N recommendations in the case of binary ratings.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Non-normalized cosine KNN (NNCosNgbr).</head><p>This is a state-of-the-art item-based collaborative filtering algorithm described in <ref type="bibr" target="#b19">[20]</ref>. Note that in the case of binary ratings we cannot compute similarity metrics such as the Pearson coefficient and the adjusted cosine, thus the item-item similarity has been measured as the cosine similarity.</p><p>Rating prediction for item i has been computed by summing up the similarities between item i and its neighbors, where the set of neighbors, denoted by D^k(u;i), has been limited to the k most similar items. This approach is usually known as kNN (k nearest neighbors). Rating prediction is computed as:</p><formula xml:id="formula_0">r_{ui} = \sum_{j \in D^{k}(u;i)} r_{uj} \cdot s^{CF}_{ij}<label>(1)</label></formula><p>where s^CF_ij represents the cosine similarity between items i and j, computed as (see footnote 1):</p><formula xml:id="formula_1">s^{CF}_{ij} = \frac{r_i \cdot r_j^{\top}}{\lVert r_i \rVert_2 \cdot \lVert r_j \rVert_2}<label>(2)</label></formula><p>This algorithm has been shown to achieve good recommendation quality, as reported in <ref type="bibr" target="#b10">[11]</ref> and <ref type="bibr" target="#b14">[15]</ref>.</p></div>
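<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of equations (1) and (2), the following is a minimal NumPy sketch and not the implementation used in the experiments. It assumes a small dense binary URM with users on the rows and items on the columns, and it assumes that the k neighbors of each item are selected once, independently of the user. The function names and the toy data are hypothetical.</p>
<p><code lang="python"><![CDATA[
```python
import numpy as np

def cosine_item_similarity(urm):
    """Eq. (2): cosine similarity between the item columns of the URM."""
    urm = np.asarray(urm, dtype=float)
    norms = np.linalg.norm(urm, axis=0)
    norms[norms == 0] = 1.0          # items without ratings get similarity 0
    sim = (urm.T @ urm) / np.outer(norms, norms)
    np.fill_diagonal(sim, 0.0)       # an item is not its own neighbor
    return sim

def keep_top_k(sim, k):
    """Keep, for each item (row), only its k largest similarities."""
    pruned = np.zeros_like(sim)
    for i in range(sim.shape[0]):
        top = np.argsort(sim[i])[::-1][:k]
        pruned[i, top] = sim[i, top]
    return pruned

def predict_scores(user_profile, sim_k):
    """Eq. (1): score of item i is the sum of r_uj * s_ij over its neighbors j."""
    return sim_k @ np.asarray(user_profile, dtype=float)

# Toy URM (hypothetical data): 4 users x 4 items
URM = np.array([[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 0, 1, 1],
                [0, 1, 1, 0]])
S = keep_top_k(cosine_item_similarity(URM), k=2)
scores = predict_scores(URM[0], S)   # one score per item for user 0
```
]]></code></p>
<p>In a top-N setting the items already present in the user profile would typically be removed before ranking the remaining items by score.</p></div>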
<div xmlns="http://www.tei-c.org/ns/1.0"><head>PureSVD.</head><p>PureSVD is a latent factor collaborative algorithm that has been shown to achieve high recommendation quality in terms of recall <ref type="bibr" target="#b10">[11]</ref>.</p><p>Let f denote the number of latent factors used for representing users and items. Typical values of f are in the range [50, 300]. SVD factorizes the URM into three f-dimensional matrices (U, Σ, and Q) and yields a rank-f approximation of the original URM:</p><formula xml:id="formula_2">R_f = U \cdot \Sigma \cdot Q^{\top}<label>(3)</label></formula><p>where U and Q are orthonormal matrices and Σ is diagonal. User u and item i are represented by the f-dimensional vectors p_u and q_i, respectively. Rating prediction is computed using the inner product:</p><formula xml:id="formula_3">r_{ui} = p_u \cdot q_i^{\top}<label>(4)</label></formula><p>If we define P = U · Σ, we can observe that, since U and Q are orthonormal, P = R · Q and (4) can be rewritten as:</p><formula xml:id="formula_4">r_{ui} = r_u \cdot Q \cdot q_i^{\top}<label>(5)</label></formula><p>Finally, observe that the inner product q_i · q_j^⊺ represents the similarity between items i and j.</p></div>
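<div xmlns="http://www.tei-c.org/ns/1.0"><p>The relation between equations (3) and (5) can be checked numerically. The sketch below is only illustrative: it assumes a small dense URM and uses the full SVD routine of NumPy, whereas a real system would rely on a truncated sparse solver. The function names and the toy data are hypothetical.</p>
<p><code lang="python"><![CDATA[
```python
import numpy as np

def pure_svd_item_factors(urm, f):
    """Return Q (items x f): the top-f right singular vectors of the URM."""
    _, _, vt = np.linalg.svd(np.asarray(urm, dtype=float), full_matrices=False)
    return vt[:f].T

def pure_svd_scores(user_profile, q):
    """Eq. (5): r_u * Q * Q^T gives one score per item."""
    return np.asarray(user_profile, dtype=float) @ q @ q.T

# Toy URM (hypothetical data): 4 users x 4 items
URM = np.array([[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 0, 1, 1],
                [0, 1, 1, 0]], dtype=float)
Q = pure_svd_item_factors(URM, f=2)
scores = pure_svd_scores(URM[0], Q)

# P = U * Sigma restricted to the f retained factors; it should equal R * Q
U, sigma, _ = np.linalg.svd(URM, full_matrices=False)
P = U[:, :2] * sigma[:2]
```
]]></code></p>
<p>Since only Q is needed at prediction time, a user can be scored from his/her current profile without any user-specific factor being stored.</p></div>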
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Content-based</head><p>Unlike collaborative filtering, content-based recommender systems rely only on content features. In the following we describe LSA <ref type="bibr" target="#b1">[2]</ref>, a well-known approach based on SVD.</p><p>Latent Semantic Analysis (LSA). Latent Semantic Analysis <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b18">19,</ref><ref type="bibr" target="#b1">2]</ref> uses SVD to reduce the dimensionality of the ICM and to capture underlying relations, such as synonymy, between items. The weights in the ICM are computed using the TF-IDF metric <ref type="bibr" target="#b28">[29]</ref>. Let l be the number of factors, typically referred to as latent size, to be used for representing items' features. The ICM can be factorized and approximated as:</p><formula xml:id="formula_5">W_l = Z \cdot \Lambda \cdot Y^{\top}<label>(6)</label></formula><p>Let us define B = Y · Λ. Similarity between items i and j is denoted by s^CBF_ij and can be computed as the cosine between vectors b_i and b_j:</p><formula xml:id="formula_6">s^{CBF}_{ij} = \frac{b_i \cdot b_j^{\top}}{\lVert b_i \rVert_2 \cdot \lVert b_j \rVert_2}<label>(7)</label></formula><p>and rating prediction is computed as:</p><formula xml:id="formula_7">r_{ui} = r_u \cdot B \cdot b_i^{\top}<label>(8)</label></formula><p>Footnote 1: x · y denotes the inner product between vectors x and y, and ‖x‖_2 denotes the Euclidean norm of vector x.</p></div>
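<div xmlns="http://www.tei-c.org/ns/1.0"><p>A possible sketch of equations (6) to (8) is given here. It assumes a small dense ICM with features on the rows and items on the columns, so that each row of B is the latent representation of one item. The weights in the toy ICM are hypothetical and only stand in for TF-IDF values.</p>
<p><code lang="python"><![CDATA[
```python
import numpy as np

def lsa_item_vectors(icm, l):
    """B = Y * Lambda: one l-dimensional row per item (ICM is features x items)."""
    _, lam, yt = np.linalg.svd(np.asarray(icm, dtype=float), full_matrices=False)
    return yt[:l].T * lam[:l]

def cosine_rows(b):
    """Eq. (7): cosine similarity between the rows of B."""
    norms = np.linalg.norm(b, axis=1)
    norms[norms == 0] = 1.0
    unit = b / norms[:, None]
    return unit @ unit.T

# Toy ICM (hypothetical weights): 5 features x 4 items
ICM = np.array([[0.9, 0.8, 0.0, 0.0],
                [0.5, 0.6, 0.0, 0.1],
                [0.0, 0.0, 0.7, 0.9],
                [0.0, 0.1, 0.8, 0.6],
                [0.2, 0.0, 0.0, 0.3]])
B = lsa_item_vectors(ICM, l=2)
S_cbf = cosine_rows(B)

# Eq. (8) for a user who watched only item 0
user_profile = np.array([1.0, 0.0, 0.0, 0.0])
scores = user_profile @ B @ B.T
```
]]></code></p>
<p>Because the similarities depend only on the item features, an item without any rating still obtains neighbors and a score.</p></div>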
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Hybrid algorithms</head><p>In the following sections we present two families of hybrid recommender systems based on one collaborative filtering algorithm and one content-based filtering algorithm. To be used in the second solution, an algorithm must allow a similarity measure between items to be derived. Thus, all the state-of-the-art collaborative and content-based algorithms presented in Sections 3.1 and 3.2 can be used in the following hybrid approaches.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Interleaved.</head><p>This algorithm is a trivial hybrid implementation that forms the recommendation list to suggest to the target user by alternating, in turn, one item predicted by the collaborative algorithm and one predicted by the content-based algorithm.</p></div>
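<div xmlns="http://www.tei-c.org/ns/1.0"><p>The interleaving step can be sketched in a few lines. The text does not specify how an item suggested by both algorithms is handled; this sketch assumes that duplicates are skipped, which is our assumption and not part of the original description. The function name and the item identifiers are hypothetical.</p>
<p><code lang="python"><![CDATA[
```python
def interleave(cf_ranked, cbf_ranked, n):
    """Alternate items from two ranked lists, skipping duplicates, up to n items."""
    merged, seen = [], set()
    for pair in zip(cf_ranked, cbf_ranked):
        for item in pair:
            if item not in seen and len(merged) < n:
                merged.append(item)
                seen.add(item)
    return merged

# Collaborative list first, content-based list second (hypothetical item ids)
top_n = interleave(["a", "b", "c"], ["x", "a", "y"], n=4)
```
]]></code></p>
<p>Both ranked lists have to be produced for every request, which is the reason for the halved throughput mentioned in Section 2.2.</p></div>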
<div xmlns="http://www.tei-c.org/ns/1.0"><head>SimComb.</head><p>Mobasher et al. <ref type="bibr" target="#b24">[25]</ref> developed a weighted hybrid algorithm in which item-to-item similarity values are computed as the linear combination of content-based and collaborative similarities:</p><formula xml:id="formula_8">c_{ij} = \alpha \cdot s^{CBF}_{ij} + (1 - \alpha) \cdot s^{CF}_{ij}<label>(9)</label></formula><p>where s^CF_ij and s^CBF_ij are computed using (<ref type="formula" target="#formula_1">2</ref>) and (<ref type="formula" target="#formula_6">7</ref>), respectively. Finally, rating prediction can be computed using (<ref type="formula" target="#formula_0">1</ref>), where s^CF_ij is replaced with c_ij. We refer to this algorithm as SimComb.</p></div>
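<div xmlns="http://www.tei-c.org/ns/1.0"><p>Equation (9) amounts to a weighted sum of two similarity matrices. The sketch below uses hypothetical similarity values for three items, the last of which is new and therefore has no collaborative similarity, and an arbitrary weight of 0.5.</p>
<p><code lang="python"><![CDATA[
```python
import numpy as np

def sim_comb(s_cbf, s_cf, alpha):
    """Eq. (9): c_ij = alpha * s_cbf_ij + (1 - alpha) * s_cf_ij."""
    s_cbf = np.asarray(s_cbf, dtype=float)
    s_cf = np.asarray(s_cf, dtype=float)
    return alpha * s_cbf + (1.0 - alpha) * s_cf

# Hypothetical similarities for 3 items; item 2 is new (no collaborative data)
S_cf = np.array([[0.0, 0.8, 0.0],
                 [0.8, 0.0, 0.0],
                 [0.0, 0.0, 0.0]])
S_cbf = np.array([[0.0, 0.2, 0.6],
                  [0.2, 0.0, 0.1],
                  [0.6, 0.1, 0.0]])
C = sim_comb(S_cbf, S_cf, alpha=0.5)

# Eq. (1) with s_cf replaced by c, for a user who watched only item 0
scores = C @ np.array([1.0, 0.0, 0.0])
```
]]></code></p>
<p>In this toy case the new item receives a non-zero score only through its content-based similarity to the watched item.</p></div>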
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">PROPOSED ALGORITHMS</head><p>The algorithms we present in the following sections are designed to overcome some of the limitations of existing algorithms. Our first priority was to design hybrid recommender systems able to work on implicit, binary ratings, and to grant good recommendation quality even when only content information is available (new items). In addition, we also focus on scalability in order to design algorithms to be used in real iTV scenarios.</p><p>In the next two sections we present two families of hybrid solutions: Filtered Feature Augmentation and Similarity Injection Knn.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Filtered Feature Augmentation (FFA)</head><p>Filtered Feature Augmentation is a feature augmentation algorithm mainly inspired by CBCF (Content Boosted Collaborative Filtering) <ref type="bibr" target="#b23">[24]</ref>. Unlike Melville's solution, our approach:</p><p>1. does not need any user-specific parameter to be learned;</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>2. allows different content-based and collaborative filtering algorithms to be used;</p><p>3. computes item-item similarities on the basis of the ratings augmented with pseudo-ratings derived from the content-based filtering, but uses the original user profiles for predicting ratings (as opposed to computing item-item similarities using the original URM and augmenting user profiles for predicting ratings).</p><p>Figure <ref type="figure" target="#fig_0">1</ref> shows the learning process: the CBF trains the content-based algorithm and computes the pseudo-ratings for unknown rating values to be added to the original URM. The augmented URM (namely, aURM) is used as input for the collaborative filtering.</p><p>Since adding all pseudo-ratings would lead to a dense, very large augmented URM, the filter selects only the most relevant pseudo-ratings. In our experiments we used two different filters: a simple one which excludes all the pseudo-ratings lower than a fixed threshold (FFAt) and a more sophisticated one which uses the Gini impurity measure <ref type="bibr" target="#b13">[14]</ref> in order to add both high and low pseudo-ratings, so as to increase the intrinsic information of the item profiles (FFAg). The Gini impurity is defined as:</p><formula xml:id="formula_9">Gini(v) = 1 - \sum_{x \in v} p_x^{2}<label>(10)</label></formula><p>where p_x is the probability of x in v. In our specific case v is an item profile (i.e., a column of the URM) and p_x = n_x / n, where n_x is the number of ratings equal to x in v and n is the number of ratings in v. When Gini(v) = 0, v is pure (and brings almost no information). As we want to add informative pseudo-ratings, the filter lets through only the pseudo-ratings that most increase the Gini impurity of each item. This is done until at least g pseudo-ratings are added to each item profile. 
The value of g depends on the number of original ratings for each user profile (denoted by n):</p><formula xml:id="formula_11">g = \begin{cases} n_{min} - n + (h \cdot n) &amp; \text{if } n \le n_{min} \\ \frac{h \cdot n_{min}^{2}}{n} &amp; \text{otherwise} \end{cases}<label>(11)</label></formula><p>where n_min and h are parameters. In our experiments we used the average number of ratings as the value for n_min and 0.3 for h.</p><p>Rating prediction has been computed by using (<ref type="formula" target="#formula_0">1</ref>), where r_ui are the original user ratings, and s^CF_ij is the similarity between items i and j computed using (<ref type="formula" target="#formula_1">2</ref>) on the augmented URM.</p></div>
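<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two quantities that drive the FFAg filter, the Gini impurity of equation (10) and the number g of pseudo-ratings of equation (11), can be computed as in the following sketch. The value h = 0.3 is the one reported in the text, while n_min = 20 and the example profiles are hypothetical. The selection of the individual pseudo-ratings is not shown.</p>
<p><code lang="python"><![CDATA[
```python
from collections import Counter

def gini_impurity(values):
    """Eq. (10): 1 minus the sum over distinct values x of (n_x / n) squared."""
    n = len(values)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(values).values())

def pseudo_ratings_to_add(n, n_min, h=0.3):
    """Eq. (11): number g of pseudo-ratings for a profile with n ratings."""
    if n <= n_min:
        return n_min - n + h * n
    return h * n_min ** 2 / n

gini_pure = gini_impurity([1, 1, 1, 1])              # all ratings are equal
gini_mixed = gini_impurity([1, 1, 0, 0])             # two values, evenly split
g_sparse = pseudo_ratings_to_add(n=10, n_min=20)     # fewer ratings than n_min
g_dense = pseudo_ratings_to_add(n=40, n_min=20)      # more ratings than n_min
```
]]></code></p>
<p>The two branches of equation (11) coincide at n = n_min, where both give h · n_min, so g decreases smoothly as the number of original ratings grows.</p></div>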
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Similarity Injection Knn (simInjKnn)</head><p>Similarity Injection Knn builds a model using item-item similarities obtained from one collaborative and one content-based algorithm.</p><p>We first compute item-item similarities s^CF_ij using collaborative filtering, retaining only the k most similar items. Similarly, we compute item-item similarities s^CBF_ij using content-based filtering.</p><p>The similarities are then merged into a single item-item similarity matrix S, by adopting a two-step process:</p><p>• all the elements s^CF_ij are copied into S. Thus, each item is filled with k similarity values deriving from the collaborative filtering.</p><p>• the elements s^CBF_ij are then inserted into the corresponding empty (i.e., zero) elements of S in such a way as to have, for each item, a set of k additional similarity values deriving from the content-based filtering.</p><p>At the end of the construction, each item will have a total of 2k similar items:</p><p>• the k most similar items according to collaborative filtering</p><p>• the k most similar items according to content-based filtering that do not appear in the previous set.</p><p>Exactly half the similarities come from the content-based technique and half from the collaborative technique.</p><p>The framework allows different combinations of item-based collaborative and content-based filtering. In this work we present the results based on LSA as content-based algorithm and NNCosNgbr as collaborative filtering, since they reached the highest quality in terms of recall. Thus, s^CF_ij and s^CBF_ij have been computed using (<ref type="formula" target="#formula_1">2</ref>) and (<ref type="formula" target="#formula_6">7</ref>), respectively. Finally, rating prediction is estimated with (<ref type="formula" target="#formula_0">1</ref>), where s^CF_ij is replaced by s_ij.</p></div>
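<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two-step construction of S can be sketched as follows. This is an illustrative reading of the procedure and not the authors' code: it assumes dense similarity matrices, treats zero entries as missing similarities, and uses hypothetical values in which the last item is new and has no collaborative neighbors.</p>
<p><code lang="python"><![CDATA[
```python
import numpy as np

def top_k_indices(row, k, exclude):
    """Indices of the k largest positive entries of row, skipping excluded ones."""
    order = np.argsort(row)[::-1]
    picked = [int(j) for j in order if row[j] > 0 and int(j) not in exclude]
    return picked[:k]

def similarity_injection(s_cf, s_cbf, k):
    """Merge k collaborative and k content-based neighbors per item into S."""
    n_items = s_cf.shape[0]
    merged = np.zeros((n_items, n_items))
    for i in range(n_items):
        cf_nbrs = top_k_indices(s_cf[i], k, exclude={i})
        for j in cf_nbrs:                      # step 1: copy collaborative values
            merged[i, j] = s_cf[i, j]
        taken = {i} | set(cf_nbrs)
        for j in top_k_indices(s_cbf[i], k, exclude=taken):
            merged[i, j] = s_cbf[i, j]         # step 2: fill cells that are empty
    return merged

# Hypothetical similarities for 4 items; item 3 is new
S_cf = np.array([[0.0, 0.9, 0.4, 0.0],
                 [0.9, 0.0, 0.3, 0.0],
                 [0.4, 0.3, 0.0, 0.0],
                 [0.0, 0.0, 0.0, 0.0]])
S_cbf = np.array([[0.0, 0.5, 0.1, 0.7],
                  [0.5, 0.0, 0.2, 0.6],
                  [0.1, 0.2, 0.0, 0.3],
                  [0.7, 0.6, 0.3, 0.0]])
S = similarity_injection(S_cf, S_cbf, k=1)

# Eq. (1) with s_cf replaced by s, for a user who watched only item 0
scores = S @ np.array([1.0, 0.0, 0.0, 0.0])
```
]]></code></p>
<p>In this example the new item has no collaborative neighbors, yet it still receives a content-based neighbor and therefore a non-zero score. The merged matrix is built offline, so a single neighborhood lookup per request is enough at recommendation time.</p></div>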
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">DATASET AND EVALUATION METHODOLOGY</head><p>We evaluated our algorithms using a variant of the methodology described in <ref type="bibr" target="#b21">[22,</ref><ref type="bibr" target="#b10">11]</ref>, designed to evaluate the performance of recommender algorithms with a focus on the new-item problem.</p><p>We used the dataset provided by an IPTV provider<ref type="foot" target="#foot_0">2</ref>. It contains 25765 binary ratings, implicitly collected over a period of six months, given by 15563 users on 794 items (density 0.0021, i.e., about 0.2% of the user-item pairs are rated). Content-based features comprise actors, directors, category (e.g., movie, series, documentary), title, genres, summary and language, for a total of 11291 features and an average of 18 features per item. The main characteristics of the available features are summarized in Table <ref type="table">1</ref>.</p><p>The evaluation of recommender systems is typically performed by using either error metrics or accuracy metrics <ref type="bibr" target="#b17">[18]</ref>. Since error metrics, such as MAE (mean absolute error) and RMSE (root mean square error), rely on computing the error between actual and predicted ratings, they cannot be measured on implicit, binary datasets where this information is not available <ref type="bibr" target="#b19">[20,</ref><ref type="bibr" target="#b14">15]</ref>.</p><p>For this reason we focus on accuracy metrics, which estimate the fraction of relevant items that are actually recommended (recall) or the fraction of recommended items that are actually relevant (precision). 
In addition, recent works <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b26">27]</ref> consider accuracy metrics more suitable than error metrics for evaluating the top-N recommendation task <ref type="bibr" target="#b17">[18]</ref>, i.e., the capability of the recommender system to suggest very short lists of items that are likely to be of interest to the users.</p><p>The standard definition of recall, which is typically used in information retrieval, is:</p><formula xml:id="formula_13">\mathrm{recall} = \frac{|\mathrm{relevant} \cap \mathrm{retrieved}|}{|\mathrm{relevant}|} <label>(12)</label></formula><p>Usually, in recommender systems, the set of relevant items is composed of the items rated positively (e.g., with a rating of 5 out of 5). However, since we are dealing with an implicit, binary dataset, in our evaluation we considered all rated items to be relevant, analogously to other works such as <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b12">13</ref>], as we do not have any further information about the degree of user satisfaction.</p></div>
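<div xmlns="http://www.tei-c.org/ns/1.0"><p>For concreteness, the set-based definition of recall in (12) can be written as a short Python function. This is only a sketch: the function name and the choice to raise an error on an empty relevant set are assumptions made for illustration.</p>
```python
def recall(relevant, retrieved):
    """Fraction of the relevant items that also appear among the retrieved ones."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant items")
    return len(relevant.intersection(retrieved)) / len(relevant)
```
<p>On a binary dataset such as the one used here, the relevant set of a user is simply the set of items the user has rated.</p></div>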
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Performance on new items</head><p>In order to specifically evaluate the impact of new items on the quality of the different algorithms, we developed a testing methodology, which we refer to as sliding window, that extends the evaluation methodology presented in <ref type="bibr" target="#b21">[22,</ref><ref type="bibr" target="#b10">11]</ref>.</p><p>The original approach evaluates the quality of recommender algorithms by measuring recall as a function of the number of items displayed to the user (N). The test consists of excluding a certain number of ratings (test set) and using the remaining ratings to train the algorithms (training set). All available content-based features are used for training the content-based and hybrid algorithms. Once an algorithm has been trained with the ratings in the training set, each rating r_ui in the test set is tested as follows:</p><p>• we predict the score for all items unrated by user u</p><p>• we select the top-N items according to the estimated score</p><p>• if item i appears in the top-N recommendation list, we have a 'hit', i.e., the system has correctly suggested a relevant item.</p><p>With respect to <ref type="formula" target="#formula_13">(12)</ref>, the set of relevant items corresponds to the test set, while the set of relevant, retrieved items corresponds to the set of hits. Thus, recall can be rewritten as a function of N, i.e., the number of items displayed to users:</p><formula xml:id="formula_14">\mathrm{recall}(N) = \frac{\#\mathrm{hits}}{|\text{test set}|}<label>(13)</label></formula><p>Because of the high dataset sparsity, we formed the test set by randomly selecting 20% of the ratings in order to have a significant number of samples. This evaluation methodology cannot measure the quality of the recommendations on the new-item problem. Therefore we have implemented some modifications. Let M denote the number of items in the dataset. 
We randomly divide the items into two sets, H_1 and H_2, each composed of M/2 items. Test set and training set are defined as a function of the percentage parameter β as follows:</p><p>• we define a set T by randomly selecting 20% of the ratings in the URM.</p><p>• the training set is defined as the set of ratings related to items in H_1, excluding all ratings in T.</p><p>• we form a set H_{1+2} composed of M/2 items, (100 − β)% randomly extracted from set H_1 and β% randomly extracted from set H_2.</p><p>• the test set is composed of the ratings in T related to items in H_{1+2}.</p><p>For each value of β we composed a training and a test set and computed the quality of the recommender algorithm in terms of recall, as defined in <ref type="formula" target="#formula_14">(13)</ref>, with N set to 20. The test is the same as the one described for the original evaluation methodology. For each rating r_ui in the test set: (i) we predict the score for all items unrated by user u, (ii) we select the top-20 items according to the estimated score, and (iii) we verify whether there has been a hit, i.e., whether item i appears in the top-20.</p><p>Note that only ratings related to half the items are available for training the algorithms. The parameter β specifies the percentage of new items, i.e., the percentage of test items that have no ratings in the training set and therefore cannot be recommended by standard collaborative filtering. In our experiments we varied β from 0% to 100%.</p><p>Collaborative filtering is expected to be able to recommend only items in H_1, the only items with ratings in the training set, so the higher β, the lower the quality of the algorithm. When β = 100% we expect the quality of any collaborative filtering algorithm to be 0.</p><p>On the other hand, content-based algorithms are trained exclusively with content-based features, and are thus totally independent of the ratings included in the training set. We expect their quality not to be influenced by β. 
Finally, hybrid approaches can be trained with the ratings related to half the items and with all available content-based features.</p></div>
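<div xmlns="http://www.tei-c.org/ns/1.0"><p>The sliding window procedure can be summarized by the following Python sketch. It is an illustration under stated assumptions rather than the code used in the experiments: the function names, the representation of the URM as a list of unique (user, item) pairs, the seed parameter, and the score(user, item) callable standing in for a trained recommender are all hypothetical.</p>
```python
import random


def sliding_window_split(ratings, items, beta, test_fraction=0.2, seed=0):
    """Return (training, test) rating lists for a new-item percentage beta (0-100)."""
    rng = random.Random(seed)
    items = list(items)
    rng.shuffle(items)
    half = len(items) // 2
    h1, h2 = items[:half], items[half:2 * half]   # two halves of M/2 items each

    # T: a random 20% of all ratings in the URM.
    t = set(rng.sample(sorted(ratings), round(test_fraction * len(ratings))))

    # Training set: ratings on items of H1, excluding the ratings in T.
    h1_set = set(h1)
    training = [r for r in ratings if r[1] in h1_set and r not in t]

    # H1+2: M/2 items, beta% drawn from H2 and the remainder from H1.
    n_new = round(half * beta / 100)
    h12 = set(rng.sample(h2, n_new)) | set(rng.sample(h1, half - n_new))

    # Test set: ratings in T that refer to items of H1+2.
    test = [r for r in ratings if r in t and r[1] in h12]
    return training, test


def recall_at_n(test, training, all_items, score, n=20):
    """recall(N) = number of hits / size of the test set."""
    seen = {}
    for user, item in training:
        seen.setdefault(user, set()).add(item)
    hits = 0
    for user, item in test:
        # Score every item the user has not rated in the training set.
        candidates = [i for i in all_items if i not in seen.get(user, set())]
        top = sorted(candidates, key=lambda i: score(user, i), reverse=True)[:n]
        hits += item in top   # a hit when the held-out item is in the top-N
    return hits / len(test) if test else 0.0
```
<p>With beta = 100 every test item comes from H_2 and has no rating in the training set, which is the setting in which a purely collaborative score function cannot produce any hit.</p></div>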
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">RESULTS AND DISCUSSION</head><p>In our tests we considered the state-of-the-art recommender algorithms described in Section 3 and the hybrid approaches proposed in Section 4. As for the former, we included two item-based collaborative algorithms (PureSVD and non-normalized cosine, NNCosNgbr), a content-based algorithm (Latent Semantic Analysis, LSA), and two hybrid algorithms (interleaved and simComb). As for the latter, we included simInjKnn and two variants of filtered feature augmentation, referred to as FFAg and FFAt. FFAg uses a filter based on Gini's impurity measure, while FFAt uses a filter based on a fixed threshold, which has been set to 1, thus excluding all the pseudo-ratings lower than 1 (the number of pseudo-ratings added to the URM was 27.5% of the original number of ratings).</p><p>The latent size l of LSA has been set to 300, while the number of features f in PureSVD is 50. The neighborhood size k has been set to 200 for NNCosNgbr. As for simComb, the coefficient α has been set to 0.3, as it empirically provides the best results. Finally, the neighborhood size k used for simInjKnn is 100.</p><p>Figure <ref type="figure">3</ref> shows the sliding window results, plotting the recall of the state-of-the-art (a) and the proposed hybrid (b) algorithms as a function of β, i.e., the percentage of new items. The recall has been computed assuming N = 20.</p><p>The most evident result is the very poor quality of the content-based algorithm, whose recall does not exceed 3%, much lower than that of the best collaborative approach, NNCosNgbr, whose recall is about 25% when recommending only old items. In addition, we can observe that three hybrid algorithms (simComb, FFAt, and simInjKnn) have a higher recall than the collaborative and content-based state-of-the-art solutions. 
As for the other two hybrid algorithms, the interleaved approach confirms its poor quality, being based on a very trivial technique, while the lower quality of FFAg is explained by the fact that binary ratings do not carry enough information for the filter to work properly. Moreover, the results of the sliding window test show that FFAt and simInjKnn outperform the state-of-the-art algorithm simComb when more than 40% of the items are new. In addition, we can notice that recall drops drastically when β = 100%, except for the content-based algorithm, which is not influenced by the presence of unrated items.</p><p>As a final observation, let us consider the performance implications of the proposed hybrid solutions. Similarity Injection Knn (simInjKnn) requires building two item-to-item similarity matrices, which is the same effort required by the interleaved algorithm. Filtered Feature Augmentation (FFAg and FFAt), on the other hand, requires computing pseudo-ratings for each item. Depending on the number of items and users, this can be computationally costly. The main benefit of Filtered Feature Augmentation is that it is completely independent of the collaborative and content-based algorithms used, as opposed to Similarity Injection Knn, which strictly requires algorithms able to express item-item similarities.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">CONCLUSIONS AND FUTURE WORK</head><p>The new hybrid algorithms we proposed have been shown to improve the overall performance of the state-of-the-art algorithms in suggesting new items, though their performance is limited by the poor quality of content-based algorithms.</p><p>In future work we plan to carry out between-subjects controlled experiments for a subjective evaluation of the proposed solutions. In fact, recent works have pointed out that objective evaluation of recommender systems (i.e., based on error and accuracy metrics) is not always aligned with the actual user perception of the recommendation quality, as measured via controlled experiments (e.g., <ref type="bibr" target="#b8">[9,</ref><ref type="bibr" target="#b26">27]</ref>).</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Model generation for FFA. The CBF stage comprises content-based model generation and recommendations for every item.</figDesc><graphic coords="5,123.14,53.82,100.40,100.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Table 1 :</head><label>1</label><figDesc>Dataset characteristics: number of different values for each type of feature (a) and number of items for each language (b).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Sliding window. Blank circles represent ratings in the set T , solid circles represent ratings in the test set, and x's represent ratings in the training set.</figDesc><graphic coords="6,361.04,53.83,150.60,114.44" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 2</head><label>2</label><figDesc>Figure 2 schematically shows how the different sets are formed. Blank circles, solid circles, and x's refer to the ratings, respectively, in the set T, in the test set, and in the training set.</figDesc></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">The ratings refer to the TV2 dataset, available at http://home.dei.polimi.it/cremones/memo/</note>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0" />			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions</title>
		<author>
			<persName><forename type="first">G</forename><surname>Adomavicius</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tuzhilin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="734" to="749" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><forename type="first">R</forename><surname>Bambini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cremonesi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Turrin</surname></persName>
		</author>
		<title level="m">Recommender Systems Handbook, chapter A Recommender System for an IPTV Service Provider: a Real Large-Scale Production Environment</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
	<note>to appear</note>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">The BellKor solution to the Netflix Prize</title>
		<author>
			<persName><forename type="first">R</forename><surname>Bell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Koren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Volinsky</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">M</forename><surname>Bell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Koren</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Seventh IEEE International Conference on Data Mining (ICDM 2007)</title>
				<imprint>
			<date type="published" when="2007-10">Oct. 2007</date>
			<biblScope unit="page" from="43" to="52" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">User modeling for adaptive news access</title>
		<author>
			<persName><forename type="first">D</forename><surname>Billsus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Pazzani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">User Modeling and User-Adapted Interaction</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="147" to="180" />
			<date type="published" when="2000-02">February 2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Hybrid recommender systems: Survey and experiments</title>
		<author>
			<persName><forename type="first">R</forename><surname>Burke</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2002-11">November 2002</date>
			<publisher>Kluwer Academic Publishers</publisher>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="331" to="370" />
			<pubPlace>Hingham, MA, USA</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">The adaptive web. chapter Hybrid web recommender systems</title>
		<author>
			<persName><forename type="first">R</forename><surname>Burke</surname></persName>
		</author>
		<imprint>
			<biblScope unit="page" from="377" to="408" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">State-of-the-Art Recommender Systems</title>
		<author>
			<persName><forename type="first">L</forename><surname>Candillier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Jack</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Fessant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Meyer</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
			<publisher>IGI Global</publisher>
			<biblScope unit="page" from="1" to="22" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A user-centric evaluation framework of recommender systems</title>
		<author>
			<persName><forename type="first">L</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Pu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACM Conference on Recommender Systems (RecSys &apos;10), Workshop on User-Centric Evaluation of Recommender Systems and Their Interfaces (UCERSTI &apos;10)</title>
				<meeting><address><addrLine>Barcelona, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010">Sept. 26-30, 2010</date>
			<biblScope unit="page" from="14" to="21" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Combining content-based and collaborative filters in an online newspaper</title>
		<author>
			<persName><forename type="first">M</forename><surname>Claypool</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gokhale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Miranda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Murnikov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Netes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sartin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of ACM SIGIR Workshop on Recommender Systems</title>
				<meeting>ACM SIGIR Workshop on Recommender Systems</meeting>
		<imprint>
			<date type="published" when="1999-08">August 1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Performance of recommender algorithms on top-n recommendation tasks</title>
		<author>
			<persName><forename type="first">P</forename><surname>Cremonesi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Koren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Turrin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">RecSys</title>
				<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="39" to="46" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Analysis of cold-start recommendations in iptv systems</title>
		<author>
			<persName><forename type="first">P</forename><surname>Cremonesi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Turrin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">RecSys &apos;09: Proceedings of the 2009 ACM conference on Recommender Systems</title>
				<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="1" to="4" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Time-evolution of iptv recommender systems</title>
		<author>
			<persName><forename type="first">P</forename><surname>Cremonesi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Turrin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the 8th European Conference on Interactive TV and Video</title>
				<meeting>of the 8th European Conference on Interactive TV and Video<address><addrLine>Tampere, Finland</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2010-06">June 2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Decision trees for business intelligence and data mining: using SAS enterprise miner</title>
		<author>
			<persName><forename type="first">B</forename><surname>de Ville</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006">2006</date>
			<publisher>SAS Publishing</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Item-based top-n recommendation algorithms</title>
		<author>
			<persName><forename type="first">M</forename><surname>Deshpande</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Karypis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Information Systems (TOIS)</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="143" to="177" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title level="m" type="main">Information retrieval using a singular value decomposition model of latent semantic structure</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">W</forename><surname>Furnas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Deerwester</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">T</forename><surname>Dumais</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">K</forename><surname>Landauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">A</forename><surname>Harshman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Streeter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">E</forename><surname>Lochbaum</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1988">1988</date>
			<publisher>ACM Press</publisher>
			<biblScope unit="page" from="465" to="480" />
			<pubPlace>New York, NY, USA</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">A scalable, accurate hybrid recommender system</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Ghazanfar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Prügel-Bennett</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2010 Third International Conference on Knowledge Discovery and Data Mining, WKDD &apos;10</title>
				<meeting>the 2010 Third International Conference on Knowledge Discovery and Data Mining, WKDD &apos;10<address><addrLine>Washington, DC, USA</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE Computer Society</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="94" to="98" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Evaluating collaborative filtering recommender systems</title>
		<author>
			<persName><forename type="first">J</forename><surname>Herlocker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Konstan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Terveen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Riedl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Information Systems (TOIS)</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="5" to="53" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">On the use of singular value decomposition for text retrieval</title>
		<author>
			<persName><forename type="first">P</forename><surname>Husbands</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Simon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ding</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2000-10">Oct. 2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Evaluation of item-based top-n recommendation algorithms</title>
		<author>
			<persName><forename type="first">G</forename><surname>Karypis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the tenth international conference on Information and knowledge management, CIKM &apos;01</title>
				<meeting>the tenth international conference on Information and knowledge management, CIKM &apos;01<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="247" to="254" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">On combining classifiers</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kittler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hatef</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Duin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Matas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Pattern Analysis and Machine Intelligence</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="226" to="239" />
			<date type="published" when="1998-03">Mar. 1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Factorization meets the neighborhood: a multifaceted collaborative filtering model</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Koren</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">KDD &apos;08: Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining</title>
				<meeting><address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="426" to="434" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Content-based Recommender Systems: State of the Art and Trends</title>
		<author>
			<persName><forename type="first">P</forename><surname>Lops</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>de Gemmis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Semeraro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Recommender Systems Handbook</title>
				<editor>
			<persName><forename type="first">F</forename><surname>Ricci</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Rokach</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Shapira</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><forename type="middle">B</forename><surname>Kantor</surname></persName>
		</editor>
		<meeting><address><addrLine>Boston, MA</addrLine></address></meeting>
		<imprint>
			<publisher>Springer US</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="73" to="105" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Content-boosted collaborative filtering for improved recommendations</title>
		<author>
			<persName><forename type="first">P</forename><surname>Melville</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">J</forename><surname>Mooney</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Nagarajan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Eighteenth national conference on Artificial intelligence</title>
				<meeting><address><addrLine>Menlo Park, CA, USA</addrLine></address></meeting>
		<imprint>
			<publisher>American Association for Artificial Intelligence</publisher>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="187" to="192" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Semantically Enhanced Collaborative Filtering on the Web</title>
		<author>
			<persName><forename type="first">B</forename><surname>Mobasher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 1st European Web Mining Forum (EWMF2003)</title>
				<meeting>the 1st European Web Mining Forum (EWMF2003)</meeting>
		<imprint>
			<date type="published" when="2003-09">Sept. 2003</date>
			<biblScope unit="page" from="57" to="76" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Recommending content for iTV: what the users really want?</title>
		<author>
			<persName><forename type="first">R</forename><surname>Navarro-Prieto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rebaque-Rivas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hernández-Pablo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 8th international interactive conference on Interactive TV&amp;Video, EuroITV &apos;10</title>
				<meeting>the 8th international interactive conference on Interactive TV&amp;Video, EuroITV &apos;10<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="123" to="126" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Comparative evaluation of recommender system quality</title>
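		<author>
			<persName><forename type="first">Paolo</forename><surname>Cremonesi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Franca</forename><surname>Garzotto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Negro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Papadopoulos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Turrin</surname></persName>
		</author>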
	</analytic>
	<monogr>
		<title level="m">CHI Extended Abstracts on Human Factors in Computing Systems</title>
				<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
	<note>ACM, to appear</note>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">A framework for collaborative, content-based and demographic filtering</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Pazzani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artif. Intell. Rev.</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page" from="393" to="408" />
			<date type="published" when="1999-12">December 1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<monogr>
		<title level="m" type="main">Automatic text processing</title>
		<author>
			<persName><forename type="first">G</forename><surname>Salton</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1988">1988</date>
			<publisher>Addison-Wesley Longman Publishing Co., Inc</publisher>
			<pubPlace>Boston, MA, USA</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Item-based collaborative filtering recommendation algorithms</title>
		<author>
			<persName><forename type="first">B</forename><surname>Sarwar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Karypis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Konstan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Riedl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">10th Int. Conf. on World Wide Web</title>
				<imprint>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="285" to="295" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Incremental singular value decomposition algorithms for highly scalable recommender systems</title>
		<author>
			<persName><forename type="first">B</forename><surname>Sarwar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Karypis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Konstan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Riedl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">5th Int. Conf. on Computer and Information Technology (ICCIT 2002)</title>
				<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="399" to="404" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Generative models for cold-start recommendations</title>
		<author>
			<persName><forename type="first">A</forename><surname>Schein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Popescul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ungar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Pennock</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACM SIGIR Workshop on Recommender Systems</title>
				<imprint>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Methods and metrics for cold-start recommendations</title>
		<author>
			<persName><forename type="first">A</forename><surname>Schein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Popescul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ungar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Pennock</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2002)</title>
				<meeting>the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2002)</meeting>
		<imprint>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="253" to="260" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">A personalised TV listings service for the digital TV age</title>
		<author>
			<persName><forename type="first">B</forename><surname>Smyth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cotter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowl.-Based Syst.</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="issue">2-3</biblScope>
			<biblScope unit="page" from="53" to="59" />
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">A personalized TV guide system compliant with MHP</title>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yuan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Consumer Electronics</title>
		<imprint>
			<biblScope unit="volume">51</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="731" to="737" />
			<date type="published" when="2005-05">May 2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<monogr>
		<title level="m" type="main">Chapter 5: TV personalization system: design of a TV show recommender engine and interface</title>
		<author>
			<persName><forename type="first">J</forename><surname>Zimmerman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kurapati</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">L</forename><surname>Buczak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Schaffer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gutta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Martino</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
