Representing Items as Word-Embedding Vectors and Generating Recommendations by Measuring their Linear Independence∗

Ludovico Boratto, Salvatore Carta, Gianni Fenu, and Roberto Saia
Dipartimento di Matematica e Informatica, Università di Cagliari
Via Ospedale 72, 09124 Cagliari - Italy
ludovico.boratto@acm.org, {salvatore, fenu, roberto.saia}@unica.it

∗This work is partially funded by Regione Sardegna under project NOMAD (Next generation Open Mobile Apps Development), through PIA - Pacchetti Integrati di Agevolazione "Industria Artigianato e Servizi" (annualità 2013).

RecSys 2016, September 15–19, 2016, Boston, MA, USA. Copyright held by the author(s).

ABSTRACT
In order to generate effective results, it is essential for a recommender system to model the information about the user's interests in a profile. Even though word embeddings (i.e., vector representations of textual descriptions) have proven to be effective in many contexts, a content-based recommendation approach that employs them is still less effective than collaborative strategies (e.g., SVD). In order to overcome this issue, we introduce a novel criterion to evaluate the word-embedding representation of the items a user rated. The proposed approach defines a vector space in which the similarity between an unevaluated item and those in a user profile is measured in terms of linear independence. Experiments show its effectiveness in producing a better ranking of the items w.r.t. collaborative filtering, both when compared to a latent-factor-based approach (SVD) and to a classic neighborhood user-based system.

CCS Concepts
•Information systems → Data mining; Recommender systems; Learning to rank;

Keywords
Semantic Analysis; Word Embeddings; Algorithms; Metrics.

1. INTRODUCTION
Recommender systems are essential for e-commerce companies, to filter the huge amounts of items they can provide and to improve the quality and efficiency of the sales criteria [3]. In order to perform this task, these systems need to define a set of profiles that model the preferences of their customers. In this context, collaborative techniques, which represent a user with the ratings given to the items she evaluated, are usually the most effective. Even though semantic technologies are moving at a very rapid pace, and state-of-the-art solutions such as deep learning algorithms able to extract word embeddings [1] from a text corpus have been successfully employed in numerous information filtering and retrieval tasks, at the moment collaborative filtering approaches continue to be more accurate at generating recommendations [2]. The extraction of word embeddings from a corpus is made possible by several state-of-the-art tools, which create a vector representation of each word (Google's word2vec¹) or of each document (a word2vec extension, usually known as doc2vec).

¹http://deeplearning4j.org/word2vec

A user profile represented by a unique vector of features allows a system to perform quick comparisons; e.g., in a content-based system, it can be easily compared to the vector that represents an item with a simple metric, like the cosine similarity. The idea behind this paper is to represent a user profile as a matrix of word embeddings, where each row represents an item the user positively evaluated, and to define a metric able to evaluate the correlation, in terms of linear independence, between an item not rated by the user and those in her matrix-based profile.

We observe that if the vector representation of an unevaluated item is linearly dependent on the positively evaluated items in a user profile, their features match, thus the item is similar to the user's preferences. This leads to a higher accuracy of a recommender system w.r.t. collaborative approaches.
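As a minimal, purely illustrative numpy sketch of this intuition (not part of the paper, and with invented toy vectors): a candidate whose vector is a linear combination of the profile rows drives the determinant of the stacked square matrix to (numerically) zero, while an unrelated candidate does not.

```python
import numpy as np

# Toy user profile: two positively rated items, three latent features each.
profile = np.array([[0.2, 0.4, 0.1],
                    [0.5, 0.1, 0.3]])

# A candidate that is a linear combination of the profile rows
# (i.e., linearly dependent on the user's preferences).
similar = 0.5 * profile[0] + 2.0 * profile[1]

# A candidate unrelated to the profile rows.
different = np.array([0.9, -0.7, 0.2])

for name, candidate in [("similar", similar), ("different", different)]:
    m = np.vstack([profile, candidate])   # 3 x 3 square matrix
    print(name, abs(np.linalg.det(m)))    # ~0 for the dependent candidate, >0 otherwise
```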
2. APPROACH
Here, we present the steps our approach performs.

Item Vectorization. Given a set I of items, we first define and train a model by using the doc2vec neural net, using as source the textual descriptions of the items in I. The result is the vector representation of the items in I, where the cardinality of each vector (item) depends on the number L of features used in the doc2vec vectorization process. Given a user, the output of this step is a matrix that contains the embeddings of the items she positively evaluated (denoted as Iu), plus an empty row employed during the filtering process to evaluate the items she has not yet evaluated.

Linear Independence Rate. To evaluate the similarity between the items in the matrix-based user profile and an unevaluated one, we define the Linear Independence Rate (LIR) coefficient. It is the average of the determinants of all the square sub-matrices obtained by decomposing the user-profile matrix into sub-matrices of size |Iu| × |Iu| (with |Iu| ≤ L; otherwise we have only a matrix of size L × L). We calculate the LIR value by moving over the entire user-profile matrix and extracting the determinant of each sub-matrix, without overlaps. Note that the maximum size of the square sub-matrices is the cardinality of the vectors, i.e., the L parameter used to build the doc2vec model. Through this compositional process, we evaluate the LIR of an item by placing its vector representation as the last row of the user profile. We consider the items whose LIR value is as close as possible to zero to be the closest to the user's preferences.

Ranking Algorithm. Algorithm 1 takes as input the items I, a user u, the items Iu she evaluated, and the number of features L in the vectors created by doc2vec (i.e., the layerSize parameter). It returns as output a list R of the items not evaluated by the user u, ranked by LIR value.

Algorithm 1 Items evaluation and ranking
Input: I = set of items, u = user, Iu = items of u, L = layerSize
Output: R = list of ranked items
 1: procedure GetRankedItems(I, u, Iu, L)
 2:   if |Iu| > L then Iu = GetLastLItems(Iu, L)
 3:   end if
 4:   V = Doc2VecVectorization(I)
 5:   M = DefineUserProfileMatrix(V, Iu)
 6:   M = AddEmptyVectorAsLastRow(M)
 7:   for each i in I do
 8:     if i NOT IN Iu then
 9:       v = GetItemVector(V, i)
10:       M = FillLastMatrixRow(M, v)
11:       LIR = CalculateLIR(M)
12:       R ← (i, LIR)
13:     end if
14:   end for
15:   return SortItemsByDescLIR(R)
16: end procedure

It should be noted that our approach is scalable by employing distributed computing models (e.g., MapReduce). Indeed, the computation of the LIR metric for each unevaluated item can be distributed over different machines.
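As a rough illustration, the three steps above can be sketched in Python. This is only our reading of the approach, not the authors' code: it assumes gensim (≥ 4.0) for the doc2vec vectorization, interprets the LIR as the average determinant of the non-overlapping k × k column blocks of the profile-plus-candidate matrix, trims the profile to at most L−1 rows so that at least one square block exists, and ranks first the items whose LIR is closest to zero. All function names, variable names, and parameter values are illustrative.

```python
import numpy as np
from gensim.models.doc2vec import Doc2Vec, TaggedDocument  # gensim >= 4 assumed

L = 100  # layerSize: number of doc2vec features per item (our choice, not the paper's)

def vectorize_items(descriptions):
    """Item Vectorization: train doc2vec on the textual descriptions and
    return one L-dimensional vector per item id."""
    corpus = [TaggedDocument(words=text.lower().split(), tags=[item_id])
              for item_id, text in descriptions.items()]
    model = Doc2Vec(corpus, vector_size=L, min_count=1, epochs=20)
    return {item_id: model.dv[item_id] for item_id in descriptions}

def lir(profile, candidate):
    """Linear Independence Rate: append the candidate as the last row of the
    user-profile matrix, split the resulting k x L matrix into non-overlapping
    k x k blocks along the columns, and average their determinants."""
    m = np.vstack([profile, candidate])
    k = m.shape[0]
    blocks = m.shape[1] // k  # number of non-overlapping square sub-matrices
    dets = [np.linalg.det(m[:, j * k:(j + 1) * k]) for j in range(blocks)]
    return float(np.mean(dets))

def get_ranked_items(vectors, rated_ids):
    """Ranking: score every unrated item and rank first those whose LIR is
    closest to zero, i.e., the items considered closest to the user's preferences."""
    profile_ids = list(rated_ids)[-(L - 1):]  # keep at most L-1 rows so k <= L
    profile = np.array([vectors[i] for i in profile_ids])
    rated_set = set(rated_ids)
    scores = [(i, lir(profile, vectors[i])) for i in vectors if i not in rated_set]
    return sorted(scores, key=lambda pair: abs(pair[1]))

# Toy usage:
# vectors = vectorize_items({"m1": "a space opera with rebels and empires",
#                            "m2": "an epic sci-fi adventure in deep space",
#                            "m3": "a romantic comedy set in Paris"})
# print(get_ranked_items(vectors, rated_ids=["m1", "m2"]))
```

On a real catalogue, the loop over unrated items in get_ranked_items is the part the MapReduce remark above targets: each LIR computation is independent and can be scored on a separate worker.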
3. EVALUATION
We compared our approach with two collaborative filtering approaches: CF, based on a classic neighborhood model, and SVD, based on the latent factor model. The experiments have been performed on a standard benchmark dataset for recommender systems, MovieLens 1M, composed of 6,040 users, 3,900 items, and 1,000,209 ratings. The training and test sets were obtained through K-fold cross-validation with K = 3, and the results were compared with independent-samples two-tailed Student's t-tests. The adopted metric is the Mean Reciprocal Rank (MRR), i.e., the average, over the test cases, of the reciprocal of the rank at which the first relevant item appears.

The first experiment evaluates the capability of the metric to measure the similarity between a user profile and an item in terms of linear independence (Figure 1, top). On the basis of our approach, we rank all the items and verify that almost all of those in the test set are placed in the top positions (i.e., ranks 1÷20), showing the capability of LIR to effectively rank the items.

The second experiment evaluates the capability of the metric to infer the future choices of the users (Figure 1, bottom), by comparing the ranks assigned to the items in the test set by the LIR metric with those assigned to the same items by the other state-of-the-art approaches taken into account. The t-tests highlighted a statistically significant difference between the results (p < 0.05).

[Figure 1: (Top) Linear similarity evaluation: percentage of test-set items falling in each rank range (01÷20, 21÷40, 41÷60, 61÷80). (Bottom) Ranking accuracy evaluation: MRR of CF, SVD, and LIR over the number of tested items (×1000: 75, 150, 225, 300).]

4. DISCUSSION AND CONCLUSIONS
The experimental results show the effectiveness of the compositional approach used by our LIR metric, as well as its capability to outperform the canonical state-of-the-art approaches in modeling the user preferences. Indeed, we obtain a strong improvement in the ranking of the unevaluated items, which means that our metric assigns a higher score (w.r.t. the state-of-the-art approaches we compared against) to the items positively evaluated by a user. The use of the LIR metric can also be extended to other contexts that do not use word embeddings, e.g., those based on a canonical term-document matrix.

Future work. We will consider our metric to evaluate the items negatively evaluated by the users, in order to obtain additional information about unpreferred items and exploit it to improve recommendation accuracy, e.g., by verifying preference collisions (i.e., very similar items rated both positively and negatively by a user).

5. REFERENCES
[1] R. Fu, J. Guo, B. Qin, W. Che, H. Wang, and T. Liu. Learning semantic hierarchies via word embeddings. In ACL (1), pages 1199–1209, 2014.
[2] C. Musto, G. Semeraro, M. de Gemmis, and P. Lops. Learning word embeddings from Wikipedia for content-based recommender systems. In Advances in Information Retrieval, pages 729–734. Springer, 2016.
[3] S. Sivapalan, A. Sadeghian, H. Rahnama, and A. M. Madni. Recommender systems in e-commerce. In World Automation Congress (WAC), pages 179–184. IEEE, 2014.