=Paper= {{Paper |id=Vol-1542/paper3 |storemode=property |title=Survey of User Profiling in News Recommender Systems |pdfUrl=https://ceur-ws.org/Vol-1542/paper3.pdf |volume=Vol-1542 |authors=Mahboobeh Harandi,Jon Atle Gulla |dblpUrl=https://dblp.org/rec/conf/recsys/HarandiG15 }} ==Survey of User Profiling in News Recommender Systems== https://ceur-ws.org/Vol-1542/paper3.pdf
organized as follows: Section 2 describes news recommender systems, their details and their challenges. Section 3 explains the different dimensions of user profiles and the relevant machine learning techniques, and summarizes the features of user profiles with respect to each learning technique. Section 4 discusses how the filtering techniques are applied in content-based, collaborative and different kinds of hybrid systems. The classification of machine learning techniques and the problems they address are illustrated before the conclusions are presented in Section 5.

2. News Recommender Systems
News recommender systems share many features with information retrieval systems and with human-computer interaction. Text mining techniques for large-scale data sets are needed, and machine learning methods are employed when learning cycles can be built into the systems. In general there are three steps. First, data pre-processing such as sampling, dimension reduction and denoising, with the use of similarity functions, is normally applied. Then the text is analyzed through supervised or unsupervised machine learning techniques, depending on the availability of training data sets. Finally, the result is interpreted through, for example, the F1-measure, ROC or MAE [1].

If we consider news recommender systems as search engines, the user profiles can be regarded as long search queries. The system ranks the results on the basis of how well the profile matches the descriptions of the news articles. Formally, the appropriateness of recommended news to the user can be described by the following utility function [1]:

$$u : C \times S \rightarrow R$$

This function assigns a score $r$ to each combination of user $c$ and news story $s$. Matrix $C$ indicates the characteristics of the users and $S$ shows the different specifications of the available articles, such as topic, location, news agency, date and other useful attributes. All the different algorithms in recommender systems try to maximize the result matrix $R$. Each entry of $R$ could be any non-negative value within an interval such as 0 to 1 or 0 to 100, depending on the system definition. In the end, an article that maximizes the utility function will be recommended [1]:

$$\forall c \in C, \quad s'_c = \arg\max_{s \in S} u(c, s)$$

News recommender systems differ from other recommenders in the structure of their items. News articles do not follow any specific format. Many news articles are published each day and have very short life spans, while the system must scale to deal with huge volumes of data. Besides, the news recommender system must always recommend interesting articles to the user, though it should not over-specialize for the target user [2].

3. User Profiles
The desired user profiles need to have a changing essence and flexible content. These profiles show the users' preferences towards news articles by modeling the articles they find interesting. Besides, storing user interactions is a basis for knowing which of their favorite topics last longer and which last only for a short period of time.
This model consists of meta-data such as time and location, which change according to the user behavior.
The content of the user profile for this kind of recommender system, whose items do not have a very structured format, is different from others. In order to have an exact and practical model of the user profiles, the system needs to know the behavior of the user, including background, interests and goals. These features change over time, so considering temporal parameters such as time and location is crucial [3].
There are three major representations of terms in the user profile. The first approach represents terms as vectors in a vector space model. In order to weigh every single word correctly, based on its frequency in each document and in the collection of documents, TF-IDF is often applied. This measure puts more emphasis on a word that appears frequently in one specific document and not in the other ones. Such a word gains more weight, and the corresponding document will be retrieved for the target user. But the problems of polysemy (multiple meanings for one word) and synonymy (multiple words for an identical meaning) remain. The desired approach reflects cultural and linguistic knowledge of terms and could also use reasoning on their content. As a result, the representation is more intelligent than a simple bag of words and could provide knowledge about the desired terms [1]. The second approach is the analysis of words in the form of entities. Entities have meanings and relations, but they suffer from generalization or specialization since there are no hierarchical relationships among them [3]. The third approach is semantic analysis that is ontology-based. It has hierarchical relationships between the semantic concepts that model user interests. The terms that indicate the user interests, including the interests that last longer and the ones that appear only for a short time, could be enriched by semantic approaches. The advantage of providing ontologies for the user interests is that all the terms or entities are in hierarchical relationships, which give more specific detail of user interests alongside the general ones [3]. The semantic enrichment could benefit from encyclopedic knowledge in addition to the knowledge of the applied documents. The terms are then semantic vectors in a word space model [1]. Each of them is indexed by its weight but is later interpreted semantically by using Wikipedia. This is called Explicit Semantic Analysis (ESA) [4].
The feedback of the user is the other approach to user modeling. In general, users could communicate their interest in the news explicitly or implicitly. Explicit feedback provides their interest (or disaffection) directly to the system. It could consist of actions such as rating, liking or filling in a survey through the interface of the application. Implicit feedback includes interactions such as clicking on articles (touching on a mobile device), scrolling through articles using a mouse or a keyboard (swiping on a mobile device), printing or saving articles, copying and posting a part or all of an article, reading articles, forwarding or sharing articles, and providing qualitative comments on an article. Recommender systems are highly dependent on user feedback. As long as the user interacts with the application, the accuracy of the system may gradually improve. Explicit feedback tends to produce more exact user profiles than what is possible with implicit feedback. Unfortunately, not all users are willing to spend time to provide such feedback, so the implicit signals of the users are normally the basis of the recommendation [5].
Specifying the type of the user's interest could help the system to cover all domains of their attention. The long-term interest depends more on the user's profession and personal background than on what can be traced in the log history. The short-term interest, in contrast, is mostly related to the current trend of the public that the user communicates with. Depending on the goals, however, the long-term interest will also change gradually. Besides,
supervising the context of the user's attention could provide good evidence to capture the short-term interest and to update the long-term interest from time to time. In [6], the current interest of the user is captured by defining a running context over category and topic. The old user profile, which is the indicator of the long-term interest, is updated progressively if it has nothing in common with the current focus. Besides, there should be a balanced focus on the old and the new user profile. While keeping the old user profile and overlooking the context results in dissatisfaction, giving too much priority to the current context will not cover the news articles that are related to the user's background and are the basis of their interest.
In addition, different times of day (morning, evening) and of the week (weekdays and weekend) could affect the user profile [7]. Considering the topic of the news articles, target users may have different desires at different times. As an example, a user might be more interested in politics and economics on weekdays and focus more on lifestyle news at the weekend [8].
While personalizing the news is desirable, the importance of the public trend is not negligible. In [9], based on the frequency of user clicks, the public trend could provide interesting news articles as well. If there are not enough clicks from the user, then the public trend of the user's location is a good indicator for recommending news. This dimension of the user profile, which specifies the location, has a key role in recommending news articles. Short-term interests of the user are highly dependent on their location. Location could capture the public trend and find similar networks of users as well. Sometimes ignoring the user profile and focusing on the context is helpful (for economic news, the user profile is not very helpful but context tracing is more informative), while at other times it is better to count only on the user profile (for the entertainment section, user profile enrichment is much better than context) [10].
As the amount of data explodes, the importance of extracting models and predicting unseen data with machine learning techniques is increasing [11]. There are two major types of learning techniques, supervised and unsupervised. In the former, an annotated training dataset is provided, whereas in the latter, the machine explores the data to identify interesting patterns without training data. Below is the list of supervised learning techniques used in recommender systems:

    Decision Trees (C4.5 or CART) handle categorical-nominal and heterogeneous data. They are also able to cope with missing values. Overfitting is addressed through pre-pruning. They tend to work well with small datasets, though the cost of decisions on continuous data streams is high [11, 12].
    Rule-based (RIPPER) can handle multi-valued features very well. It is decision tree-based and uses rules to categorize new items. It utilizes post-pruning to find the best fit for the rule set [13].
    K Nearest Neighbor (KNN) can handle continuous data through Euclidean, Manhattan or Minkowski distance and cope with categorical data through Hamming distance. It is a lazy learner that works well with few instances [14, 15].
    Rocchio and Relevance Feedback: the user profile is regarded as a query [16] and, based on the implicit feedback of the user, the recommendation improves over time.
    Support Vector Machine (SVM): SVM reduces the sensitivity to noise and increases generalization. For a non-linear problem, if there are more features than instances, a linear kernel is good enough to be applied [16, 17].
    Probabilistic methods and Naive Bayes: the Bayesian Belief Network with conditional independency is the most applicable one. Multivariate Bernoulli and multinomial are the two types of Naive Bayes. While the Bernoulli model checks the absence or presence of a term, the multinomial model counts the number of occurrences of a term [18, 19].
    Neural Network: the single-layer perceptron and the multi-layer perceptron (for problems that are not linearly separable) are examples of neural networks applied in recommender systems [20].

Below is the list of unsupervised learning techniques:

    Probabilistic methods: if the structure of the Bayesian network is not known, then the DAG of the network can be built with a scoring function, constraint-based learning or Conditional Independency. The last one is the most efficient [21]. Other techniques such as Bayesian Hidden Score (pairwise learning) and graph-based learning have been applied in [22].
    Neural Network: the Self Organizing Map (Kohonen) and the Restricted Boltzmann Machine belong to the category of unsupervised learning [20].
    Clustering: flat clustering with the k-means algorithm deals with categorical data, and the most frequent term becomes the centroid. In hierarchical clustering, the other type of clustering, divisive is more accurate than agglomerative. There are two approaches to labeling clusters. The first one is differential labeling, in which the label with the higher score is chosen through feature selection. The second one is cluster-internal labeling, in which the term closest to the title, or with the highest weight in the centroid of the cluster, is chosen as the label. The drawback of cluster-internal labeling is its inability to distinguish between words that are frequent in all clusters and words that are frequent only in one specific cluster. Labeling in hierarchical clustering is more complicated due to the dependent definitions of parent, child and sibling [16].

Table 1 shows the machine learning techniques applied to build up a user profile.

4. Applying User Profiles in Recommender Systems
There are different approaches to filtering information. Content-based and collaborative filtering are the most applicable ones. In content-based filtering, the content of the news articles is analyzed. Then, according to the content of the user profile (i.e. the characteristics of the articles read), similar articles are predicted and presented to the user. In content-based filtering, the utility function is:

$$u(c, s) = \mathrm{score}(\mathrm{ContentBasedProfile}(c), \mathrm{Content}(s))$$

If the content of both the user profile and the item profile is represented by TF-IDF weights, then the scoring function could be calculated through the cosine similarity of the weight vectors. To achieve an accurate prediction, the attributes of the news articles that are taken into account are important. Since news articles are unstructured by nature, extracting relevant and important features has a key role in content-based filtering. If the articles are
categorized with minimum misclassification error, then storing interesting news articles in the user profile is much easier and, consequently, recommendations are of higher quality. Bayesian Networks can be utilized well for learning user profiles based on the articles that have been read. They can model the profiles of the users by ignoring missing data and considering conditional dependency within one specific category of news articles. They can provide the probabilities of each attribute of an article through their nodes. The modeled domain includes continuous data. Then the similarity between the user profile, based on the predicted attributes of articles, and the available news articles is computed, and the articles with the highest scores are recommended. If another technique such as Naive Bayes (Bernoulli model) is applied for modeling user behavior, the output is binary, as it considers the absence or presence of terms regardless of their conditional dependency [1]. It can suggest a new item to the target user by comparing the new item's characteristics to the terms in the user's profile. But if there are not enough attributes, content-based filtering is normally not the most efficient approach. If the user is new to the system, it cannot recommend anything, as no profile content is available. Besides, it causes a lack of serendipity by providing too many similar news articles to the user.

                         Table 1. ML techniques and features of user profiles
ML technique                           User profile features
Decision Tree (C4.5)                   Semantic enrichment can be handled at entity level, but at the beginning of building the user profile or for capturing short-term interest [13, 23].
Rule-based (RIPPER)                    Semantic enrichment can be handled at entity level. More interesting categories of news may be predicted through rules [1].
KNN                                    Captures the short-term interest of the user and the popularity of the item among a group of users.
Rocchio and Relevance Feedback         User profiles are regarded as queries; the system improves over time from the relevance feedback of the user [16].
Support Vector Machine                 It outperforms KNN, C4.5 and Rocchio [16] on the Reuters dataset.
Probabilistic methods and Naive Bayes  Bernoulli works well with small datasets and multinomial works well with large datasets.
                                       DAG captures the dependency of items in more detail when capturing interest, is robust to missing data and could disregard noisy data.
                                       BHS and graph-based learning capture the online interest of the user [22].
Neural Network                         It can represent details of the user's interest through deep learning of a three-layer perceptron [24].
Clustering                             The content of the items is clustered and then item-based collaborative filtering is implemented on the output.
                                       Fuzzy membership over k-means.
                                       Similarity of the item-rating matrix and the group-rating matrix (MovieLens).
                                       Hierarchical clustering for the news groups (LDA for small datasets and PLSI for large datasets) [25].

Considering the collaborative approach to filtering information, there are two different models, memory-based and model-based. The memory-based model utilizes the log history of all users and puts the top-N similar users, who have the same taste in news articles, into one specific group. Then, to provide the latest and most interesting news articles to the target user, it identifies users with the same interest and recommends the new articles that they have read. It works with a matrix of user profiles and all the news articles. It is possible to apply K Nearest Neighbor (through a neighborhood measurement) to find the closest users to the current active user. The other approach is applying a similarity measurement such as cosine similarity or Pearson correlation, which provides a new item for the target user if it is similar to previously chosen items. It can help us find similar users or items with regard to the context of memory [23].
The other type of collaborative filtering is model-based. It is more scalable and much faster than memory-based collaborative filtering. In this type of filtering not all of the dataset is traced and investigated; only some information is modeled. As finding the similarity between users or news articles (users with the same interest in specific news, or two similar news articles that are interesting to one specific user) is not feasible due to the lack of labeled data in the training phase, clustering of news or users could be a practical solution. With the Google News dataset, clustering is done on the basis of users' clicks on different news articles. Through clustering, latent factors (latent semantic analysis) can be revealed. Ignoring these hidden values will result in very poor accuracy. It could be helpful to distinguish hidden variables through clustering and so provide a more accurate prediction of news articles [23]. One technique to implement this approach is matrix factorization of the matrix of users and items. The matrix of users and news articles suffers from sparsity, since there are many positions for which users do not provide any feedback. To find the hidden variables that affect the recommendation as well, UV decomposition (it is one instance of Singular Value Decomposition) can be applied. If the utility matrix $M$ is $n \times m$ ($n$ indicates the users and $m$ indicates the news articles), then UV decomposition factors it into the product of two different matrices, $U$ ($n \times d$) and $V$ ($d \times m$):

$$M \approx U V$$
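As a minimal sketch of UV decomposition on a sparse user-article matrix, the NumPy code below factors a small, made-up rating matrix into $U$ and $V$ by gradient descent on the observed entries only, and then measures the RMSE on those entries. The matrix values, the latent dimension `d`, the learning rate, the number of steps and the function names `uv_decompose` and `rmse` are illustrative assumptions; they are not taken from any of the surveyed systems.

```python
import numpy as np

def uv_decompose(M, d=2, steps=2000, lr=0.01, seed=0):
    """Factor M (n x m, NaN marks a missing entry) into U (n x d) and V (d x m).

    Plain gradient descent on the squared error over observed entries.
    All hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, m = M.shape
    observed = ~np.isnan(M)
    filled = np.where(observed, M, 0.0)
    U = rng.uniform(0.1, 1.0, size=(n, d))
    V = rng.uniform(0.1, 1.0, size=(d, m))
    for _ in range(steps):
        # Residual on observed entries; missing entries contribute nothing.
        err = np.where(observed, filled - U @ V, 0.0)
        # Both steps are computed from the same U and V, then applied.
        U_step = err @ V.T
        V_step = U.T @ err
        U += lr * U_step
        V += lr * V_step
    return U, V

def rmse(M, U, V):
    """Root mean squared error of U @ V against the observed entries of M."""
    observed = ~np.isnan(M)
    diff = (M - U @ V)[observed]
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical 4 users x 4 articles matrix; NaN means "no feedback".
M = np.array([[5, 3, np.nan, 1],
              [4, np.nan, np.nan, 1],
              [1, 1, np.nan, 5],
              [np.nan, 1, 5, 4]], dtype=float)

U, V = uv_decompose(M)
predicted = U @ V  # the blank entries of M are now filled with estimates
print(f"RMSE on observed entries: {rmse(M, U, V):.3f}")
```

The entries of `predicted` at the positions where `M` is missing are the estimates a model-based recommender would rank; a production system would typically add regularization and hold out some observed entries for evaluation.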
RMSE is a common tool to measure the accuracy of predicting the blank entries from the product $UV$.

Although model-based filtering works much faster than memory-based filtering, it is less exact. In spite of all the different applicable approaches to collaborative filtering, it cannot make an accurate prediction for a new user or a new item (the cold-start problem). The core of all the algorithms depends on a group of users (or items) in order to find the proper match for the target user. Consequently, it has nothing to present to a user with a unique taste.
As each of these filtering techniques has its own problems and challenges in recommender systems, a hybrid system is often preferred. It takes both filtering techniques into account in predefined steps and could overcome the drawbacks of each. Considering the two filtering techniques (content-based and collaborative), the order in which they are combined might be important when building a hybrid system, although in some hybridization techniques the order does not matter. The techniques for which the order is not important are [26]:
    Mixed: the results from both techniques are presented in one grouped list or in separate lists. It has been utilized in [27] to recommend TV shows to users. The mixed hybrid system provides recommendations based on the characteristics of each show and the preferences of other users.
    Weighted: the score of each technique is computed, and the weighted final score is the basis for the recommendation. In personalized Tango (P-Tango) for online newspapers, equal weights are initially assigned to both filtering techniques. The weights are then gradually adjusted according to the user ratings. Based on the ratings, the absolute error is computed, and it decreases as the recommendations improve.
    Switching: this technique uses some criterion to switch between filtering techniques and recommends items based on the chosen filter. In the DailyLearner switching hybrid system, content-based filtering with k nearest neighbor is applied first. If it does not produce sufficient recommendations, collaborative filtering takes advantage of similar users' interests to recommend desired items. In another system, item-based collaborative filtering is triggered if the accuracy of the content-based filtering part is low [28].
    Feature combination: the technique treats the output of one filtering type, such as collaborative filtering, as a feature allied with the data. Then content-based filtering is applied. Through this kind of hybrid system, the absolute dependency on users is dropped by applying collaborative filtering as a feature combination. In the movie recommender domain [29], the RIPPER algorithm is implemented with item features and user ratings.

There are three other models of hybrid systems that are ordered by their intrinsic structure:
    Feature augmentation: one of the filtering techniques is applied to compute rating scores or to classify items. The output of this filtering is the input for the other filtering technique. In the Libra system, content-based filtering through Naive Bayes is done on data that comes from Amazon. The data from Amazon that show related authors and titles were produced using collaborative filtering. Collaborative filtering is done first.
    Meta-level: it provides a model built by one of the filtering methods as an input for the other one. The complete model is passed on, not just the output of a learned model as in feature augmentation. In Fab [30], collections of items (the needs of users in the mass of data on the web) are first composed by means of relevance feedback and the Rocchio algorithm (content-based). K-nearest neighbor is then used with collaborative filtering to complete the recommendations. Meta-level is the only ordered technique that applies content-based filtering first.
    Cascade: similarly to the other ordered techniques, it refines the candidates that have been filtered by the previous technique. But if items have very low priorities in the first filtering, they will not enter the second filtering stage. In fact, the second filtering step is only applied to provide more accurate recommendations, and an item without a sufficient rating score will not be in the second phase. Fab [31] is an example of this technique. With collaborative filtering in the selection stage, the items are chosen with an exact score and presented to the user.
According to the hybrid systems implemented in news recommendation (such as DailyLearner), the switching scheme is the most common strategy. It can start with content-based filtering, utilize Naive Bayes to categorize the news articles based on their content, and apply item-based collaborative filtering to calculate the similarity between the news articles and the user profile. On the other hand, it is also possible to apply collaborative filtering to find the closest users to the active user (through KNN) and then, with content-based filtering, identify more similar items based on the similarity computation between the user profile and the news articles.

Table 2 shows the machine learning techniques applied to deal with the issues of news recommender systems.

           Table 2. Machine learning techniques and challenges addressed

  ML technique               Challenges addressed in news recommender systems
  -------------------------  --------------------------------------------------
  Decision Tree (C4.5)       Capturing short-term interest [1].

  Rule-based (RIPPER)        Serendipity can be supported with new category
                             reasoning [32].

  KNN                        Short-term interests; providing the latest news
                             to the user based on their interests [1].

  Rocchio and Relevance      Handling long-term interest of the user [1].
  Feedback

  Support Vector Machine     Sparsity problem and the large volume of data
                             accumulated after long-term use of the
                             application [33].

  Probabilistic methods      Handling long-term interest of the user;
  and Naive Bayes            sparsity problem; noisy data; cold start;
                             precise interest of the user [28].

  Neural Network             Short-term and long-term interest [34].
                             A tied Boltzmann machine with residual
                             parameters can outperform a simple collaborative
                             filtering method (item-based Pearson
                             correlation) in the non-cold-start setting, and
                             it is competitive with content-based filtering
                             in the cold-start setting (Netflix).
                             Changing interest of the user [24].

  Clustering                 Cold start. Through fuzzy membership, new and
                             interesting news articles can be presented to
                             the user [25].
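
As an illustration of the Rocchio and relevance feedback row in Table 2, the following is a minimal sketch of how a long-term term-weight profile might be updated from user feedback. It assumes profiles and articles are sparse dictionaries of term weights (for example TF-IDF values). The function name `rocchio_update` and the default coefficients `alpha`, `beta` and `gamma` are illustrative assumptions, not values taken from the surveyed systems.

```python
from collections import defaultdict


def rocchio_update(profile, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Return a new term-weight profile after one round of feedback.

    profile      -- dict mapping term -> weight (the current long-term profile)
    relevant     -- list of article vectors the user reacted positively to
    non_relevant -- list of article vectors the user rejected or ignored
    alpha, beta, gamma -- assumed mixing coefficients (hypothetical defaults)
    """
    updated = defaultdict(float)

    # Keep a scaled copy of the existing profile.
    for term, weight in profile.items():
        updated[term] += alpha * weight

    # Add the centroid of relevant articles, subtract that of non-relevant ones.
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, weight in doc.items():
                updated[term] += coeff * weight / len(docs)

    # Drop non-positive weights, a common convention for profile vectors.
    return {term: w for term, w in updated.items() if w > 0}


if __name__ == "__main__":
    profile = {"election": 1.0}
    clicked = [{"election": 1.0, "budget": 1.0}]
    skipped = [{"sports": 1.0}]
    print(rocchio_update(profile, clicked, skipped))
```

Because the old profile is carried forward with weight `alpha`, terms the user has shown interest in over many sessions persist, which is why this family of methods is associated with long-term interest in the table.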


5. Conclusion
The news recommender system is somewhat different from other
recommender systems. It is used to provide a variety of
personalized news articles that have very short life spans. In
addition, the range of the user’s interests is wide and changes
over time and across contexts. These characteristics necessitate
very dynamic analyses of user profiles.
  In this paper, the distinguishing characteristics that affect
recommendation strategies are assessed. The user feedback on
recommended items is one of them. Different machine learning
algorithms (falling into the supervised and unsupervised
categories) are discussed for building user profiles. In addition,
because the user profile depends on the overall filtering
framework, the filtering techniques are also studied. They utilize
user profiles in diverse ways, which affects the accuracy of the
corresponding recommendations.


References
[1] Ricci, F., et al., Recommender Systems Handbook. 2010:
    Springer-Verlag New York, Inc. 842.
[2] Özgöbek, Ö., J.A. Gulla, and R.C. Erdur, A Survey on
    Challenges and Methods in News Recommendation, in
    Proceedings of the 10th International Conference on Web
    Information System and Technologies. April 2014: Barcelona.
[3] Bouneffouf, D., Towards User Profile Modelling in
    Recommender System. 2013.
[4] Gabrilovich, E. and S. Markovitch, Computing semantic
    relatedness using Wikipedia-based explicit semantic
    analysis, in Proceedings of the 20th international joint
    conference on Artificial intelligence. 2007, Morgan
    Kaufmann Publishers Inc.: Hyderabad, India. p. 1606-1611.
[5] Jon Atle Gulla, J.E.I., Arne Dag Fidjestøl, John Eirik
    Nilsen, Kent Robin Haugen, and Xiaomeng Su, Learning
    User Profiles in Mobile News Recommendation. Journal of
    Print and Media Technology Research, September 2013.
    Vol II, No. 3: p. 183-194.
[6] Jon Atle Gulla, A.D.F., Xiaomeng Su and Humberto
    Castejon, Implicit User Profiling in News Recommender
    Systems, in Proceedings of the 10th International
    Conference on Web Information System and Technologies.
    April 2014: Barcelona.
[7] Abel, F., et al., Analyzing user modeling on twitter for
    personalized news recommendations, in Proceedings of the
    19th international conference on User modeling, adaption,
    and personalization. 2011, Springer-Verlag: Girona, Spain.
    p. 1-12.
[8] Adomavicius, G. and A. Tuzhilin, Context-aware
    recommender systems, in Proceedings of the 2008 ACM
    conference on Recommender systems. 2008, ACM:
    Lausanne, Switzerland. p. 335-336.
[9] Liu, J., et al., Personalized news recommendation based on
    click behavior, in Proceedings of the 15th international
    conference on Intelligent user interfaces. 2010, ACM:
    Hong Kong, China. p. 31-40.
[10] Bellogín, A., et al., Discovering Relevant Preferences in a
     Personalised Recommender System using Machine
     Learning Techniques, in Preference Learning Workshop
     (PL 2008), at the 8th European Conference on Machine
     Learning and Principles and Practice of Knowledge
     Discovery in Databases (ECML PKDD 2008). 2008.
[11] Witten, I.H., E. Frank, and M.A. Hall, Data Mining:
     Practical Machine Learning Tools and Techniques. 2011:
     Morgan Kaufmann Publishers Inc. 664.
[12] Venkatadri, M. and L.C. Reddy, A Comparative Study On
     Decision Tree Classification Algorithms In Data Mining.
     2010; Available from:
     https://www.academia.edu/1374211/A_Comparative_Study_On_Decision_Tree_Classification_Algorithms_In_Data_Mining.
[13] Pazzani, M.J. and D. Billsus, Content-based
     recommendation systems, in The adaptive web, P.
     Brusilovsky, A. Kobsa, and W. Nejdl, Editors. 2007,
     Springer-Verlag. p. 325-341.
[14] Deokar, S., Weighted K Nearest Neighbor. 2009.
[15] Webb, G., M. Pazzani, and D. Billsus, Machine Learning
     for User Modeling. User Modeling and User-Adapted
     Interaction, 2001. 11(1-2): p. 19-29.
[16] Manning, C.D., et al., Introduction to Information
     Retrieval. 2008: Cambridge University Press. 496.
[17] Ghazanfar, M.A. and A. Prügel-Bennett, Building Switching
     Hybrid Recommender System Using Machine Learning
     Classifiers and Collaborative Filtering. IAENG
     International Journal of Computer Science.
[18] Margaritis, D., Learning Bayesian Network Model
     Structure. 2003, University of Pittsburgh.
[19] Barber, D., Bayesian Reasoning and Machine Learning.
     2010.
[20] Peretto, P., An Introduction to the Modeling of Neural
     Networks. 1992: Cambridge University Press.
[21] Kotsiantis, S.B., Supervised Machine Learning: A Review
     of Classification Techniques, in Proceedings of the 2007
     conference on Emerging Artificial Intelligence
     Applications in Computer Engineering: Real Word AI
     Systems with Applications in eHealth, HCI, Information
     Retrieval and Pervasive Technologies. 2007, IOS Press. p.
     3-24.
[22] Bian, J., et al., Exploiting User Preference for Online
     Learning in Web Content Optimization Systems. ACM
     Trans. Intell. Syst. Technol., 2014. 5(2): p. 1-23.
[23] Rajaraman, A. and J.D. Ullman, Mining of Massive
     Datasets. 2011: Cambridge University Press. 326.
[24] Gunawardana, A. and C. Meek, A unified approach to
     building hybrid recommender systems, in Proceedings of
     the third ACM conference on Recommender systems.
     2009, ACM: New York, New York, USA. p. 117-124.
[25] Mouton, C., Unsupervised Word Sense Induction from
     Multiple Semantic Spaces with Locality Sensitive Hashing,
     in International Conference RANLP. 2009.
[26] Burke, R., Hybrid Recommender Systems: Survey and
     Experiments. User Modeling and User-Adapted Interaction,
     2002. 12(4): p. 331-370.
[27] Cotter, P. and B. Smyth, PTV: Intelligent Personalised TV
     Guides, in Proceedings of the Seventeenth National
     Conference on Artificial Intelligence and Twelfth
     Conference on Innovative Applications of Artificial
     Intelligence. 2000, AAAI Press. p. 957-964.
[28] Ghazanfar, M.A. and A. Prügel-Bennett, An Improved
     Switching Hybrid Recommender System Using Naive
     Bayes Classifier and Collaborative Filtering. International
     Multiconference of Engineers and Computer Scientists
     (IMECS 2010), Vols I-III, 2010: p. 493-502.
[29] Basu, C., H. Hirsh, and W. Cohen, Recommendation as
     classification: using social and content-based information
     in recommendation, in Proceedings of the fifteenth
     national/tenth conference on Artificial
     intelligence/Innovative applications of artificial
     intelligence. 1998, American Association for Artificial
     Intelligence: Madison, Wisconsin, USA. p. 714-720.
[30] Balabanović, M. and Y. Shoham, Fab: content-based,
     collaborative recommendation. Commun. ACM, 1997.
     40(3): p. 66-72.
[31] Balabanović, M., An adaptive Web page
     recommendation service, in Proceedings of the first
     international conference on Autonomous agents. 1997,
     ACM: Marina del Rey, California, USA. p. 378-385.
[32] Markward Britsch, N.G., Michael Schmelling, Application
     of the rule-growing algorithm RIPPER to particle physics
     analysis. 2008.
[33] Anatole Gershman, T.W., Eugene Fink, Jaime Carbonell,
     News Personalization using Support Vector Machines.
[34] Kyo-Joong Oh, W.-J.L., Chae-Gyun Lim, Ho-Jin Choi,
     Personalized News Recommendation using Classified
     Keywords to Capture User Preference, in Advanced
     Communication Technology (ICACT). 2014.