=Paper=
{{Paper
|id=Vol-528/paper-10
|storemode=property
|title=Exploiting Semantic Web Technologies for Recommender Systems: A Multi View Recommendation Engine
|pdfUrl=https://ceur-ws.org/Vol-528/paper10.pdf
|volume=Vol-528
|dblpUrl=https://dblp.org/rec/conf/ijcai/Oufaida09
}}
==Exploiting Semantic Web Technologies for Recommender Systems: A Multi View Recommendation Engine==
Houda OUFAIDA, Omar NOUALI
DTISI Laboratory, CERIST Research Center
03, Rue frères Aissou - Ben Aknoun – Algiers, Algeria
{houfaida, onouali}@mail.cerist.dz
Copyright © 2009, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Abstract
Collaborative filtering systems are probably the best-known recommendation techniques in the recommender systems field. They have been deployed in many commercial and academic applications. However, these systems still have limitations such as the cold start and sparsity problems. Recently, the use of semantic web technologies such as social recommendations and semantic resources has been investigated. We propose a multi view recommendation engine that integrates social and semantic recommendations in addition to collaborative recommendations. Three hybridization strategies for combining the different types of recommendations are also proposed. Finally, an empirical study was conducted to verify our proposal.
Introduction
Dealing with information overload is one of the most challenging problems in the information access field; the Web is a perfect example. Retrieval systems (Google, AltaVista, Yahoo, ...) succeed in selecting suitable items for a specific user query, but these items are the same for every user in every situation. Recommender systems, in contrast, aim to make personalized recommendations to users according to their preferences, tastes and interests, either expressed by the users themselves or learned by the recommender system over time.
There has been much work in this research area since the early 1990s, and it remains active today. The experiments of Foltz and Dumais (Foltz and Dumais 1992) on four recommendation techniques showed promising results. Resnick and collaborators proposed one of the first and probably the best-known recommender system in the literature, GroupLens (Resnick et al. 1994), which recommends netnews articles to users according to their previous ratings.
Since then, several models have been proposed in the literature and many more applications have been developed in industry. Examples include e-commerce websites such as Amazon.com, which recommends books, CDs and various other items, and MovieLens and Netflix, which recommend movies and DVDs.
Recently, a new generation of systems, called semantic and social recommender systems, has emerged, taking advantage of advances in semantic web technologies and features such as ontologies, taxonomies, social networks and tagging.
In this paper, we introduce a multi view recommender system that includes collaborative, social and semantic views of the user's profile. Each view recommends a set of items. Hence, three hybridization strategies are proposed for re-ranking the recommendations. Finally, results from our experiments are presented.
The rest of the paper is organized as follows. First, we present the introduction of new Web 2.0 aspects in recommender systems. Then we describe our multi view recommender system: the user's multi view representation, the three recommendation modules (collaborative, social and semantic matching) and the hybridization strategies. Finally, we discuss our experimental results and conclude with a summary and outlook.
Related Work
The key to an efficient recommender system is a better understanding of both users and items. However, traditional recommender systems consider limited data (ratings, keywords) to compute predictions and do not take into account the different factors needed to understand the reasons behind a user's judgment: is it the item's content, its quality, or the fact that a friend recommended it? Consequently, classic user communities reflect only a global similarity, which is usually insufficient to describe the relations connecting users, and even less those connecting items.
With the emergence of Web 2.0, a new generation of recommender systems has appeared: semantic and social recommender systems.
The availability of large product taxonomies on the Web (UNSPSC, Amazon.com, ODP for example) has encouraged the use of taxonomy-based user and item descriptions in recommender systems. Quickstep
(Middleton, Shadbolt, and De Roure 2004) used a paper topic ontology, the AKT ontology, to extract weighted ontology topics as the user's profile. (Lops, Degemmis, and Semeraro 2007) implemented a k-means clustering algorithm for neighborhood generation based on semantic similarities between users. Each user's profile contains two semantic vectors, of positive and negative weighted concepts, extracted from the WordNet lexical database.
Mobasher and collaborators (Mobasher, Jin and Zhou 2004) propose an enhanced similarity measure that combines two measures, a semantic item similarity and the classical rating similarity, in a linear combination to perform recommendations. Moreover, (Wang and Kong 2007) calculate three similarity measures: collaborative, semantic and demographic. An offline clustering algorithm is applied to reduce computational complexity.
Another promising aspect of the semantic Web is item tagging (Flickr, del.icio.us). Karen and collaborators (Karen, Marinho, and Schmidt-Thieme 2008) proposed to extend the User × Item rating matrix with user tags as items and item tags as users. Szomszor et al. (Szomszor et al. 2007) proposed the use of collaborative tagging, also known as folksonomies, to enrich users' profiles. Thus, each user has a tag cloud, as does each item. The user's interest in each tagged item can then be predicted from the semantic similarity between the item's tags and the user's tag cloud.
The huge popularity of online social communities such as Facebook (175 million registrations) and MySpace (110 million registrations) has encouraged the use of users' social and personal data in the recommendation process, especially in taste related domains (movies, music, ...).
The first idea for introducing social networks into recommender systems was to replace the similarity based neighborhood formation with a social neighborhood (friends and friends of friends). (Sinha and Swearingen 2001) compared recommendations made by users' friends with those predicted by the system. The results showed that users prefer their friends' recommendations. This can be explained by the fact that users trust their friends' choices.
(Groh and Ehmig 2007) conducted an empirical study to compare collaborative and social recommendations. The experiments showed that social recommenders perform as well as the best collaborative filtering systems when data is sparse. Similarly, (Golbeck and Ziegler 2006) developed a social network website, FilmTrust, where users manage their FOAF (Friend Of A Friend vocabulary) based profiles, and used the TidalTrust algorithm (Golbeck 2005) to infer trust values over the social network. The experimental results showed a strong correlation between trust relationships and profile similarities.
(Massa and Avesani 2004) presented a trust-aware recommender system named «Web of Trust» in which users designate a number of users they trust. This model takes the User × Item rating matrix and the User × User trust matrix and produces as output a predicted User × Item rating matrix that is less sparse than the original one. Such a method is particularly beneficial for new user recommendations.
Proposed Approach
Seeking a greater understanding of users' choices and judgments, we propose a novel approach that introduces social and semantic levels into the recommendation process, beyond the collaborative level. Combining collaborative recommendations with social and semantic ones is thus the key idea of our proposal.
Figure 1: Multi view recommendation engine (components: user registration and feedback; profile acquisition and profile updating; semantic resources (ontologies, taxonomies) with concepts' extraction; user's profile with collaborative, social and semantic views; vote matching, social matching and concept matching, each producing recommendations; recommendations re-ranking)
User's Representation
To reflect the user's needs, the user's profile is represented by three dimensions, or views.
The collaborative view contains the user's explicit or implicit ratings.
The socio-demographic view contains the user's social data, such as age, gender, profession, location, personal and professional home pages, and friends' contact lists.
The semantic view represents the user's interests as a vector of weighted concepts based on a hierarchical classification of items.
Neighborhoods Generation
Each of the three views proposed above is used by a recommendation engine to assign the user to a specific neighborhood and thus generate recommendations.
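As a rough illustration of the three-view representation described above, the following Python sketch stores the views in one profile object. The class and field names (`UserProfile`, `SocialView`, `ratings`, `concepts`, and so on) are hypothetical and chosen for this example only; in particular, storing friends as a mapping from friend id to a trust value is an assumption, not something the paper prescribes at this point.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SocialView:
    """Socio-demographic view: social data attached to the user (hypothetical layout)."""
    age: Optional[int] = None
    gender: Optional[str] = None
    profession: Optional[str] = None
    location: Optional[str] = None
    home_pages: List[str] = field(default_factory=list)
    # Assumption: friends are kept as friend id -> trust value.
    friends: Dict[str, float] = field(default_factory=dict)


@dataclass
class UserProfile:
    """One user profile holding the collaborative, social and semantic views."""
    user_id: str
    # Collaborative view: item id -> explicit or implicit rating.
    ratings: Dict[str, float] = field(default_factory=dict)
    # Socio-demographic view.
    social: SocialView = field(default_factory=SocialView)
    # Semantic view: concept -> interest weight in [0, 1].
    concepts: Dict[str, float] = field(default_factory=dict)


# Example usage with made-up values.
profile = UserProfile(user_id="u1")
profile.ratings["book-42"] = 4.0
profile.social.location = "Algiers"
profile.social.friends["u2"] = 5.0
profile.concepts["comedy"] = 0.9
```

Keeping the views separate in this way lets each recommendation engine read only the view it needs.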
Collaborative Neighborhood. The collaborative view contains the user's explicit or implicit ratings. Pearson correlation can be used to compute similarities between users, and the k-nearest-neighbors algorithm to determine the neighborhood in the classic way.
Social Neighborhood. Social recommendations are based on the user's social community, which contains the user's friends together with trust values expressing how much the active user trusts each of them. The user annotates his relationships with this information. Trust can be binary (trust or don't trust) or expressed on a scale, for example a 1-5 scale where 1 is low trust and 5 is high trust. Based on these trust values, the user's social neighborhood can be inferred over the social network, for example with the TidalTrust algorithm (Golbeck 2006).
Semantic Neighborhood. The semantic view represents the user's interests in the content of items. This requires a semantic representation of item content.
We chose to combine a hierarchical semantic classification of items with the user's evaluations to generate this view. The motivation behind this choice is the availability of such meta-information, for instance in internet and e-commerce portals (Yahoo, Open Directory, LookSmart, Amazon, etc.), where items are gathered into topics that are themselves organized into a hierarchy going from the most general to the most specific.
We assume the existence of such a classification H, in which every item d is represented by a weighted concept vector $C_d$:
\[ C_d = \{ (c_1^d, w_1^d), (c_2^d, w_2^d), \ldots, (c_n^d, w_n^d) \} \]
The semantic view is a key element of our proposal; it is represented by a weighted concept vector $C_u$. Its concepts are extracted from the descriptions $C_d$ of the items the user has already rated:
\[ C_u = \{ (c_1^u, w_1^u), (c_2^u, w_2^u), \ldots, (c_m^u, w_m^u) \} \]
A concept's weight represents its interest score for the user. We propose to use a weighted average to compute the concept's average rating, which expresses how much the user is interested in this concept; the result is divided by the maximum rating value $Max_v$ (5 for example) to obtain a value in [0,1]:
\[ w(c) = \frac{1}{Max_v} \cdot \frac{\sum_j w_j \, r_{u,j}}{\sum_j w_j} \]
The user's vector $C_u$ is updated when the active user rates a new item d. For each concept c contained in the new item's vector, there are four possible situations:
1. c already exists in $C_u$;
2. c is a super class concept of a concept in $C_u$;
3. c is a sub class concept of a concept in $C_u$;
4. c is a new concept, and is neither a super class nor a sub class concept of a concept in $C_u$.
We propose the following algorithm (Algorithm 1) for updating the user's semantic profile. It is executed for each new rating r:
Algorithm 1: Profile Updating
 Begin
 Input $C_d = \{ (c_1^d, w_1^d), (c_2^d, w_2^d), \ldots, (c_n^d, w_n^d) \}$ /* item's d vector */
       $C_u = \{ (c_1^u, w_1^u), (c_2^u, w_2^u), \ldots, (c_m^u, w_m^u) \}$ /* user's u vector */
       $v_{u,d} = r$ /* user u rating on item d */
 Foreach $c_i^d \in C_d \mid w_i^d \ge min_{wd}$ Do
   Switch $c_i^d$ :
     $c_i^d \in C_u$ : /* $c_i^d$ already exists in $C_u$ */
       $w_i^u = \frac{1}{Max_v} \cdot \frac{\sum_j w_j v_{u,j} + w_i^d \cdot r}{\sum_j w_j + w_i^d}$ /* updating the weight of $c_i^d$ */
     $\exists\, c_j^u \in C_u \mid c_i^d \in S(c_j^u)$ : /* $c_i^d$ is a super class concept of a concept in $C_u$ */
       $C = \{ c' \mid c' \in C_u \ \&\ c_j^u \in S(c') \}$
       Foreach $c' \in C$ Do
         $w_{c'} = \frac{1}{Max_v} \cdot \frac{\sum_j w_j v_{u,j} + w_i^d \cdot sim(c', c_i^d) \cdot r}{\sum_j w_j + w_i^d \cdot sim(c', c_i^d)}$
       End
     $\exists\, c' \in C_u \mid c' \in S(c_i^d)$ : /* $c_i^d$ is a sub class concept of a concept c' in $C_u$ */
       $w_{c'} = \frac{1}{Max_v} \cdot \frac{\sum_j w_j v_{u,j} + w_i^d \cdot sim(c', c_i^d) \cdot r}{\sum_j w_j + w_i^d \cdot sim(c', c_i^d)}$
       $C_u = C_u \cup (c_i^d, \frac{w_i^d \cdot r}{Max_v})$ /* adding $c_i^d$ to $C_u$ */
     Else : /* $c_i^d$ is a new concept */
       $C_u = C_u \cup (c_i^d, \frac{w_i^d \cdot r}{Max_v})$ /* adding $c_i^d$ to $C_u$ */
   End
 End
 End.
In order to generate recommendations based on the semantic view of the user's profile, users with similar interests must be found to build the semantic neighborhood.
The hierarchical organization of concepts allows us to reach users with similar concepts as well as those having more specific concepts in their semantic views. For example, in a hierarchical film classification, if we know that a user u likes "comedy" films in general, he should have the concept "comedy" with a high interest weight (0.9 for example) in his semantic view. If other users like more specific kinds of comedy, such as "dark comedy" or "fantasy comedy", these users should belong to the active user's neighborhood with a certain membership degree.
Algorithm 2 builds such a neighborhood.
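The "comedy" example above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy hierarchy `PARENT`, the placeholder similarity `toy_sim` (1.0 for the same concept, 0.5 for a direct sub concept) and the function names are hypothetical. The membership degree follows one reading of the User Concept Matching algorithm, namely concept similarity divided by the absolute difference of the two interest weights plus one, and only direct sub concepts are considered. Keeping the best degree when a user matches several concepts is also a choice made for this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical toy hierarchy: child concept -> parent concept (IS-A relation).
PARENT = {
    "dark comedy": "comedy",
    "fantasy comedy": "comedy",
    "comedy": "film",
    "drama": "film",
}


def direct_subconcepts(concept: str) -> List[str]:
    """Return the concepts that are direct children of `concept`."""
    return [child for child, parent in PARENT.items() if parent == concept]


def toy_sim(a: str, b: str) -> float:
    """Placeholder similarity (assumption): 1.0 if equal, 0.5 if parent/child, else 0."""
    if a == b:
        return 1.0
    if PARENT.get(a) == b or PARENT.get(b) == a:
        return 0.5
    return 0.0


def concept_neighbors(
    active_view: Dict[str, float],
    other_views: Dict[str, Dict[str, float]],
    concept: str,
) -> List[Tuple[str, float]]:
    """Users holding `concept` or a direct sub concept, with a membership degree."""
    w_active = active_view[concept]
    candidates = [concept] + direct_subconcepts(concept)
    neighbors = []
    for user_id, view in other_views.items():
        best = 0.0
        for c in candidates:
            if c in view:
                degree = toy_sim(concept, c) / (abs(w_active - view[c]) + 1.0)
                best = max(best, degree)
        if best > 0.0:
            neighbors.append((user_id, best))
    # Highest membership degree first.
    return sorted(neighbors, key=lambda pair: pair[1], reverse=True)


# Example usage with made-up semantic views.
active = {"comedy": 0.9}
others = {
    "u2": {"comedy": 0.9},
    "u3": {"dark comedy": 0.7},
    "u4": {"drama": 0.8},
}
neighbors = concept_neighbors(active, others, "comedy")
```

In this toy data, the user sharing "comedy" with the same weight ranks first, the "dark comedy" fan joins the neighborhood with a lower degree, and the "drama" fan is left out.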
Algorithm 2: User Concept Matching
 Begin
 Input $C_u = \{ (c_1^u, w_1^u), (c_2^u, w_2^u), \ldots, (c_m^u, w_m^u) \}$ /* user's u vector */
 Foreach $c_i^u \in C_u \mid w_i^u \ge min_{wu}$ Do
   $V_{init} = \{ u_j \mid c_i^u \in C_{u_j} \} \cup \{ u_j \mid c_s \in C_{u_j} \ \&\ c_s \in subconcepts(c_i^u) \}$
   Foreach $u_j \in V_{init}$ Do
     $degree(u_j) = \frac{sim(c_i^u, c^{u_j})}{|w_i^u - w^{u_j}| + 1}$
     Priority_List_$c_i^u$.add($u_j$, $degree(u_j)$)
   End
 End
 $V_u = \bigcup_{i=1..m}$ Priority_List_$c_i^u$
 End.
The membership degree is proportional to the similarity between the two users' concepts and inversely proportional to the difference between their interest scores. Thus, for each concept with a significant weight ($\ge min_{wu}$), we look for users having the same concept in their semantic views ($V_{init}$) and for users with more specific concepts; the SubConcepts(c) function looks for such users (Algorithm 3).
Algorithm 3: SubConcepts(c)
 Begin
 If (depth(c) = depth(H)) then /* c is a leaf concept */
   $subconcepts(c) = \emptyset$
 Else
   If (depth(c) = depth(H) - 1) then /* c is a super class concept of a leaf concept */
     $subconcepts(c) = \{ c' \mid c' \text{ IS-A } c \ \&\ \exists u \ c' \in C_u \}$
   Else
     $subconcepts(c) = \{ c' \mid c' \text{ IS-A } c \ \&\ \exists u \ c' \in C_u \}$
     $C = \{ c'' \mid c'' \text{ IS-A } c' \ \&\ c' \text{ IS-A } c \}$
     While ($subconcepts(c) = \emptyset$) Do
       $subconcepts(c) = \bigcup_{c' \in C} subconcepts(c')$
       $C = \{ c'' \mid c'' \text{ IS-A } c' \ \&\ c' \in C \}$
     End
   End
 End
 End.
Once the semantic neighborhood is built, what remains is to predict ratings on items (Algorithm 4).
Algorithm 4: Prediction
 Begin
 Foreach $c_i^u \in C_u$ Do
   While Priority_List_$c_i^u$.count > 0 Do
     $p_{u,j} = k \sum_{i=1}^{n} sim(u, u_i) \, v_{i,j}$
     with $k = \frac{1}{\sum_{i=1}^{n} sim(u, u_i)}$
     and $sim(u, u_i) = \frac{1}{\sum_{i \le m} w_i^u} \sum_{i \le m} w_i^u \, degree_i(u, u_i)$
   End
 End
 End.
Recommendations' Re-Ranking
Since the collaborative, social and semantic recommendation engines each produce their own list of recommendations, the recommendations must be re-ranked. The question here is "which hybridization strategy to adopt?" Burke (Burke 2005) experimented with five hybridization strategies: weighted, switching, cascade, feature combination and feature augmentation hybrids. In this paper, we propose three possible hybridization strategies: mixed, weighted and switched.
For this we introduce a confidence value per concept and per recommendation engine. This value represents how much a user likes the items classified under this concept that come from a specific recommendation engine. The intuition behind this proposal is that a specific user u may, for example, like friends' recommendations for "comedy" films and semantic recommendations for "documentary" films.
Hence, for each concept in the semantic view, we introduce three confidence values denoted $F_{coll}$, $F_{soc}$ and $F_{sem}$ for the collaborative, social and semantic concept confidence. For each recommendation engine, we compute the percentage of returned items classified under a concept c that are relevant:
\[ F = \frac{|\{ d \mid r_{u,d} \ge R,\ c \in C_d,\ w_c \ge W \}|}{|\{ d \mid c \in C_d,\ w_c \ge W \}|} \]
R is the minimum user rating for an item to be considered relevant (4 for example), and W is the minimum weight of the concept in item d for it to be considered significant (0.7 for example).
For each concept in the semantic view, the three confidence values are maintained. Thus, the concept vector $C_u$ is completed as follows, where $p_{coll_i}$, $p_{soc_i}$ and $p_{sem_i}$ hold the three confidence values of concept $c_i^u$:
\[ C_u = \{ (c_1^u, w_1^u, p_{coll_1}, p_{soc_1}, p_{sem_1}), \ldots, (c_m^u, w_m^u, p_{coll_m}, p_{soc_m}, p_{sem_m}) \} \]
For new concepts, the three confidence values are initialized as $F_{coll} = F_{soc} = F_{sem} = 1/3$.
Mixed Hybridization. Perhaps the first idea that comes to mind is simply to mix the recommendations from the three recommendation engines. If an item is recommended by more than one engine, the final rating is calculated as the average of the engines' ratings. The following linear combination computes this average:
\[ r_{u,d} = \alpha \cdot r_{coll} + \beta \cdot r_{soc} + \delta \cdot r_{sem} \]
with $\alpha = \beta = \delta = 1/n$ if d is recommended by n recommendation engines ($n \le 3$). If a recommendation engine doesn't recommend d, its corresponding rating r is 0.
Weighted Hybridization. Unlike in the first hybridization strategy, the values of α, β and δ are proportional to the confidence values of the recommended item's concepts. Hence, the α parameter is computed as the weighted average of the item's collaborative confidence values, as well as β and
δ. We propose the following algorithm, to be applied to each resulting item (Algorithm 5).
Algorithm 5: Weighted Hybridization
 Begin
 Input $C_d = \{ (c_1^d, w_1^d), (c_2^d, w_2^d), \ldots, (c_n^d, w_n^d) \}$ /* item's d vector */
       $C_u = \{ (c_1^u, w_1^u, p_{coll_1}, p_{soc_1}, p_{sem_1}), \ldots, (c_m^u, w_m^u, p_{coll_m}, p_{soc_m}, p_{sem_m}) \}$ /* user's u vector */
 $\alpha = \frac{\sum_j w_j^d \, p_{coll_j}}{\sum_j w_j^d}$ ; $\beta = \frac{\sum_j w_j^d \, p_{soc_j}}{\sum_j w_j^d}$ ; $\delta = \frac{\sum_j w_j^d \, p_{sem_j}}{\sum_j w_j^d}$
 /* $p_{coll_j} = p_{soc_j} = p_{sem_j} = 1/3$ if $c_j^d \notin C_u$ */
 /* Normalization */
 $\alpha = \frac{\alpha}{\alpha + \beta + \delta}$ ; $\beta = \frac{\beta}{\alpha + \beta + \delta}$ ; $\delta = \frac{\delta}{\alpha + \beta + \delta}$
 End.
Switched Hybridization. In this strategy, if an item is recommended by more than one recommendation engine, we choose the rating provided by the engine corresponding to the maximum of the item's global confidence values α, β and δ.
Experimental Evaluation
To evaluate our multi view recommender system, we use the BookCrossing dataset1. This dataset contains 42643 implicit ratings provided by 10000 users on 21944 books, which gives an average of 4.26 ratings per user. These ratings were collected from the All Consuming2 website, where people can share their interests in books, movies, food and other items. However, users' friends lists are not available; only each user's age and location are available.
Amazon uses a hierarchy of nodes, called Browse Nodes, to organize its items for sale. Each node represents a collection of items, such as "Harry Potter books", not the items themselves. Browse nodes are related in a hierarchical structure.
Hence, for all rated books in the dataset, we crawled the Amazon web service for 15 days to get each book's nodes. The result was 309205 nodes, of which 6176 are distinct, which gives an average of 14 nodes per book.
However, Amazon does not provide node weights. For this reason, and in order to favor the most specific nodes while diminishing the weight of nodes that occur very frequently, we estimated the weight of node i as follows:
\[ Weight(i) = \frac{\frac{depth_i}{Max_{depth}} \cdot \log\left(\frac{N}{n_i}\right)}{Max_{weight}} \]
where $depth_i$ is the depth of node i in Amazon's classification, $Max_{depth}$ is the depth of the most specific node of the current item, N is the number of items classified under the root node "books", $n_i$ is the number of items classified under node i and, finally, $Max_{weight}$ is used to normalize all resulting weight values for the current item. We also used Lin's semantic similarity for this evaluation.
Our evaluation methodology was as follows. The user's collaborative, social and semantic views are built. The collaborative view contains the user's ratings. Since friends list data is not available, we simulated the social neighborhood by considering users living in the same location and having similar ages. For the semantic view, we generated different semantic views depending on the number of ratings considered: seven collaborative and semantic views are constructed for each user, for 1, 5, 10, 20, 30, 40 and 50 ratings considered. The social view remains the same since it does not depend on the user's ratings.
We varied the number of ratings considered for recommendation generation and then measured recommendation accuracy using the MAE measure and coverage using the Recall measure, applied to each recommendation engine separately and also to the mixed hybridization strategy.
For each recommendation list, we calculated the average of the MAE and Recall values for the Top5, Top10, Top20, Top30, Top40 and Top50 items. Figure 2 displays our results.
Preliminary results show that, in terms of precision, the semantic recommendation engine produces more accurate recommendations than the collaborative engine, especially with a small number of ratings (<10). In terms of recall, however, the collaborative engine recommends more relevant items. The semantic engine's poor recall may in part be explained by the fact that the SubConcepts function was limited to one level, i.e. we considered only direct subclasses in the user's neighborhood generation.
The mixed hybridization strategy appears to offer a compromise between the good precision of semantic recommendations and the good recall of collaborative recommendations. It outperforms the collaborative engine in terms of recall while keeping an accuracy comparable to that of the semantic recommendation engine (Figure 3).
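The node weight estimate used in the experimental setup can be illustrated with a short Python sketch. Several details are assumptions made for this example: the natural logarithm is used (the paper does not state the base), Max_weight is taken to be the largest unnormalized weight among the nodes of the current item, and the node names, depths and item counts are invented. The function name `node_weights` is hypothetical.

```python
import math
from typing import Dict, Tuple


def node_weights(nodes: Dict[str, Tuple[int, int]], total_items: int) -> Dict[str, float]:
    """Estimate normalized weights for the browse nodes of one item.

    nodes maps a node name to (depth_i, n_i): its depth in the hierarchy and
    the number of items classified under it. total_items is N, the number of
    items under the root node.
    """
    max_depth = max(depth for depth, _ in nodes.values())
    # Unnormalized weight: deeper (more specific) and rarer nodes score higher.
    raw = {
        name: (depth / max_depth) * math.log(total_items / count)
        for name, (depth, count) in nodes.items()
    }
    # Assumption: Max_weight is the largest raw weight for this item.
    max_weight = max(raw.values())
    if max_weight == 0:
        return {name: 0.0 for name in raw}
    return {name: value / max_weight for name, value in raw.items()}


# Example usage with invented counts for a single book.
book_nodes = {
    "Fiction": (1, 500),
    "Fantasy": (2, 100),
    "Harry Potter books": (3, 10),
}
weights = node_weights(book_nodes, total_items=1000)
```

With these numbers the most specific and least populated node receives weight 1.0, and the broad "Fiction" node receives the smallest weight.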
Figure 3: Comparison between collaborative, social,
semantic and mixed recommendation engines
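For reference, the mixed strategy compared in Figure 3, together with the weighted and switched variants from the re-ranking section, can be sketched as follows. This is an illustrative reading rather than the authors' code: the function names, the dictionary layout and the example values are hypothetical, and treating a missing rating as 0 in the weighted combination extends the convention the paper states for the mixed strategy.

```python
from typing import Dict, Optional

ENGINES = ("coll", "soc", "sem")


def mixed_rating(ratings: Dict[str, Optional[float]]) -> Optional[float]:
    """Average of the ratings given by the engines that recommend the item."""
    given = [r for r in ratings.values() if r is not None]
    return sum(given) / len(given) if given else None


def coefficients(
    item_vector: Dict[str, float],
    user_conf: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Normalized alpha, beta, delta as confidence averages weighted by concept weight."""
    default = {engine: 1.0 / 3.0 for engine in ENGINES}  # concept not in the user's view
    total_weight = sum(item_vector.values())
    raw = {
        engine: sum(w * user_conf.get(c, default)[engine] for c, w in item_vector.items())
        / total_weight
        for engine in ENGINES
    }
    norm = sum(raw.values())
    return {engine: raw[engine] / norm for engine in ENGINES}


def weighted_rating(ratings: Dict[str, Optional[float]], coeff: Dict[str, float]) -> float:
    """Linear combination of engine ratings; a missing rating counts as 0 (assumption)."""
    return sum(coeff[e] * (ratings.get(e) or 0.0) for e in ENGINES)


def switched_rating(ratings: Dict[str, Optional[float]], coeff: Dict[str, float]) -> Optional[float]:
    """Rating of the recommending engine with the highest coefficient."""
    given = [e for e in ENGINES if ratings.get(e) is not None]
    if not given:
        return None
    return ratings[max(given, key=lambda e: coeff[e])]


# Example usage with invented values: the semantic engine does not recommend the item.
item_vector = {"comedy": 0.8, "romance": 0.2}
user_conf = {"comedy": {"coll": 0.2, "soc": 0.6, "sem": 0.2}}
ratings = {"coll": 4.0, "soc": 5.0, "sem": None}
coeff = coefficients(item_vector, user_conf)
mixed = mixed_rating(ratings)
weighted = weighted_rating(ratings, coeff)
switched = switched_rating(ratings, coeff)
```

In this example the social engine has the highest confidence for the item's concepts, so the switched strategy returns its rating, while the mixed strategy averages the two available ratings.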
1 http://www.informatik.uni-freiburg.de/~cziegler
2 http://www.allconsuming.net/
Figure 2: TopN MAE and Recall for collaborative, semantic and multi view recommendation engines
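The two measures reported in Figure 2 are not spelled out in the text. The sketch below uses their standard definitions: MAE is the mean absolute difference between predicted and actual ratings, and recall at N is the share of a user's relevant items that appear among the top N recommendations. The function names and the sample data are hypothetical; which items count as relevant (for example those rated at least 4) is left to the caller.

```python
from typing import Dict, List, Set


def mean_absolute_error(predicted: Dict[str, float], actual: Dict[str, float]) -> float:
    """Mean absolute difference over the items present in both mappings."""
    common = predicted.keys() & actual.keys()
    if not common:
        raise ValueError("no items in common between predictions and actual ratings")
    return sum(abs(predicted[i] - actual[i]) for i in common) / len(common)


def recall_at_n(ranked_items: List[str], relevant: Set[str], n: int) -> float:
    """Fraction of the relevant items found among the first n recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:n] if item in relevant)
    return hits / len(relevant)


def average_recall(ranked_items: List[str], relevant: Set[str], cutoffs: List[int]) -> float:
    """Average recall over several Top-N cutoffs, e.g. 5, 10, 20, 30, 40, 50."""
    return sum(recall_at_n(ranked_items, relevant, n) for n in cutoffs) / len(cutoffs)


# Example usage with invented ratings and a short recommendation list.
predicted = {"a": 4.5, "b": 3.0, "c": 2.0}
actual = {"a": 5.0, "b": 3.0, "c": 4.0}
mae = mean_absolute_error(predicted, actual)

ranked = ["a", "b", "c", "d"]
relevant_items = {"a", "c", "e"}
recall_top2 = recall_at_n(ranked, relevant_items, 2)
recall_avg = average_recall(ranked, relevant_items, [2, 4])
```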
Conclusion
In this paper, we have proposed a multi view recommendation engine that exploits semantic web technologies, such as semantic item descriptions and social networks, beyond the classic ratings data. The results of our experiments were very promising and improved the recommendation process in several ways:
1. Exploiting semantic background knowledge enriches the description of the different system elements (users, items);
2. Enhanced semantic description improves item classification and user clustering, which helps the system to produce more accurate predictions.
We believe that the introduction of a semantic level in recommender systems explains users' judgments in a semantic way and should lead to a greater understanding of the target users.
Social elements are particularly beneficial in taste related domains. Our multi view recommendation system could make semantically enhanced predictions for one category of items (scientific papers for example) and socially enhanced recommendations for another (music, movies) if the user prefers that. Testing this proposal in an online study would therefore be interesting; it constitutes one possible direction for future work.
The use of Web services that provide social data about users based on unified user models (FOAF, APML for example) is another interesting issue to investigate. Social communities may increase trust in recommender systems and encourage users to communicate with like-minded people. Such consistent user participation in turn provides more information about their interests and preferences.
References
Burke, R. 2005. Hybrid Systems for Personalized Recommendations. Book chapter: Intelligent Techniques for Web Personalization. 133-152, Springer.
Foltz, P.W., and Dumais, S.T. 1992. Personalized Information Delivery: An Analysis of Information Filtering Methods. Communications of the ACM 35 (12), 51-60.
Golbeck, J. 2005. Computing and Applying Trust in Web-based Social Networks. Ph.D. thesis. University of Maryland. College Park, MD USA.
Golbeck, J., and Ziegler, C.N. 2006. Generating Predictive Movie Recommendations from Trust in Social Networks. In Proc. of the fourth international conference on trust management.
Groh, G., and Ehmig, C. 2007. Recommendations in taste related domains: Collaborative Filtering vs. Social Filtering. ACM GROUP'07.
Karen, H.L.; Marinho, L.B.; and Schmidt-Thieme, L. 2008. Tag-aware Recommender Systems by Fusion of Collaborative Filtering Algorithms. ACM SAC'08.
Lops, P.; Degemmis, M.; and Semeraro, G. 2007. Improving social filtering techniques through WordNet-Based user profiles. UM 2007.
Massa, P., and Avesani, P. 2004. Trust-aware Collaborative Filtering for Recommender System. In "On the Move to Meaningful Internet Systems: CoopIS, DOA, and ODBASE". Berlin, Heidelberg: Springer, pp. 3-17.
Middleton, S.E.; Shadbolt, N.R.; and De Roure, D.C. 2004. Ontological User Profiling in Recommender Systems. ACM Trans. Information Systems, vol. 22, no. 1, pp. 54-88.
Mobasher, B.; Jin, X.; and Zhou, Y. 2004. Semantically enhanced collaborative filtering on the Web. Book chapter. Web Mining: From Web to Semantic Web.
Resnick, P.; Iacovou, N.; Suchak, M.; Bergstrom, P.; and Riedl, J. 1994. GroupLens: An open architecture for collaborative filtering of netnews. In Proc. of the 1994 Conference on Computer Supported Collaborative Work, Eds. ACM Press, New York. 175-186.
Sinha, R., and Swearingen, K. 2001. Comparing recommendations made by online systems and friends. DELOS-NSF Workshop on Personalization and Recommender Systems in Digital Libraries.
Szomszor, M.; Cattuto, C.; Alani, H.; O'Hara, K.; Baldassarri, A.; Loreto, V.; and Servedio, V.D.P. 2007. Folksonomies, the Semantic Web, and Movie Recommendation. In Proc. of the ESWC'07.
Wang, R.Q., and Kong, F.S. 2007. Semantic-Enhanced Personalized Recommender System. In Proc. of the international conference on machine learning and cybernetics.