=Paper=
{{Paper
|id=Vol-1245/paper3
|storemode=property
|title=Exploiting FrameNet for Content-Based Book Recommendation
|pdfUrl=https://ceur-ws.org/Vol-1245/cbrecsys2014-paper03.pdf
|volume=Vol-1245
|dblpUrl=https://dblp.org/rec/conf/recsys/ClercqSPH14
}}
==Exploiting FrameNet for Content-Based Book Recommendation==
Orphée De Clercq
LT3, Language and Translation Technology Team, Ghent University
orphee.declercq@ugent.be

Michael Schuhmacher
Research Group Data and Web Science, University of Mannheim
michael@informatik.uni-mannheim.de

Simone Paolo Ponzetto
Research Group Data and Web Science, University of Mannheim
simone@informatik.uni-mannheim.de

Véronique Hoste
LT3, Language and Translation Technology Team, Ghent University
veronique.hoste@ugent.be
ABSTRACT

Adding semantic knowledge to a content-based recommender helps to better understand the items and user representations. Most recent research has focused on examining the added value of semantic features based on structured web data, in particular Linked Open Data (LOD). In this paper, we focus in contrast on semantic feature construction from text, by incorporating features based on semantic frames into a book recommendation classifier. To this purpose we leverage the semantic frames obtained by parsing the plots of the items under consideration with a state-of-the-art semantic parser. By investigating this type of semantic information, we show that these frames are also able to represent information about a particular book, but without the need for explicitly structured data describing the books. We reveal that exploiting frame information outperforms a basic bag-of-words approach and that especially the words relating to those frames are beneficial for classification. In a final step we compare and combine our system with the LOD features from a system leveraging DBpedia as knowledge resource. We show that both approaches yield similar results and reveal that combining semantic information from these two different sources might even be beneficial.

Categories and Subject Descriptors

H.3 [Information Storage and Retrieval]: Content Analysis and Indexing; H.4 [Information Systems Applications]: Miscellaneous

Keywords

Content-Based Recommender Systems, Semantic Frame, Linked Data

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyright 2014 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA.

1. INTRODUCTION

Recommender systems are omnipresent online and constitute a significant part of the marketing strategy of various companies. In recent years, a lot of advances have been made in constructing collaborative filtering systems, whereas research on content-based recommenders has lagged somewhat behind. Similar to evolutions in information retrieval research, the focus has been more on optimizing tools and finding more sophisticated techniques, leveraging for example big data, than on the actual understanding or processing of the items or text at hand.

In Natural Language Processing (NLP), on the other hand, huge advances have been made in processing text from both a lexical and a semantic perspective. In this respect, we believe it is important to test whether a content-based recommender system might actually benefit from plugging in more semantically enriched text features, which is the purpose of the current research. In this paper we wish to investigate to what extent leveraging semantic frame information can help in recommending books to users. We chose to work with books, since these typically contain a chronological description of certain actions or events which might be indicative of the interests of a particular reader. Someone might enjoy reading historical novels, for example, but be more drawn to those novels where a love story is described in closer detail than to those where a typical revenge story is portrayed. We hypothesize that the semantic frames and/or events in these two types of historical novels will be different. In other words, we wish to investigate to what extent deep semantic parsing of the plots describing a book following the FrameNet paradigm can help for recommendation.

In order to validate these claims we performed an extensive analysis on a book recommendation dataset which was provided in the framework of the 2014 ESWC challenge. What is particularly interesting about this dataset is that all the books have been mapped to their corresponding DBpedia URIs, which allows us to directly compare externally
gained semantic information, as available in the Linked Open Data (LOD) cloud, with internal semantic information based on the plots themselves. Our analysis reveals that although some frames and events are good indicators of genres derived from external DBpedia information, they do represent some additional information which might help the recommendation process.

To actually verify this finding we test the added value of incorporating frame information as semantic features in a basic recommender system. We see that exploiting this kind of semantic information outperforms a standard bag-of-words unigram baseline and that incorporating frame elements and lexical units evoking the frames allows for the best overall performance. If we compare our best system to a system leveraging semantic LOD information, we observe that our frames approach is not able to outperform this system. We do find, however, that if we combine these two semantic information sources into one system we get the best overall performance. This might indicate that combining semantic information from different sources, i.e. from the linguistically grounded implicit frame features and the explicit, ontology-grounded DBpedia features, is beneficial.

The remainder of this paper is structured as follows. In Section 2 we describe related work with an explicit focus on the added value of semantic information for recommender systems. In Section 3 we then explain in closer detail the construction of and reasoning behind the semantic frame enhancement. We then continue by describing the actual experimental setup (Section 4) and analyze the results in closer detail (Section 5). We finish with some concluding remarks and ideas for future work (Section 6).

2. RELATED WORK

In content-based recommender systems, the items to be recommended are represented by a set of features based on their content, whereas a user is represented by his or her profile. To build a recommender, both information sources are compared. Most content-based recommenders use quite simple retrieval models, such as keyword matching or the vector space model with basic TF-IDF weighting [15]. A problem with these models is that they tend to ignore semantic information. To overcome this, one can use Explicit Semantic Analysis (ESA) [10] instead of TF-IDF weighting, which makes it possible to represent a document as a weighted vector of concepts. Another way to add more linguistic knowledge is to use, for example, information from WordNet as done by [6, 3]. An alternative is to use language models to represent documents, as done for example by [16] when exploring content-based filtering of calls for papers. Besides retrieval models, machine learning techniques where a system learns the user profile and classifies items as interesting or not are also used for content-based recommenders. One of the first to do this was [2], using a Naïve Bayes classifier.

When it comes to adding semantic information to recommender systems, we see that leveraging Linked Open Data (LOD) is currently a popular research strand. [11] and [18] were among the first to use LOD for recommendation. The former use this information to build open recommender systems, whereas the latter built a music recommender using collaborative filtering techniques. [4] were the first to really leverage LOD to build a content-based recommender and the first to exploit the semantics of the relations in the link hierarchy. They use LOD information from DBpedia, Freebase and LinkedMDB as the only background knowledge for a movie recommender system and show that thanks to this ontological information the quality of a standard content-based system can be improved. In more recent work, the semantic item descriptions based on LOD have been merged with positive implicit feedback in a graph-based representation to produce a hybrid top-N item recommendation algorithm, SPrank [17], which further underlines the added value of this kind of data. Moreover, in order to spark research on LOD and content-based recommender systems, a shared task was organized in 2014 by the same authors, i.e. the ESWC-14 Challenge1.

In content-based recommendation, the advances that have been made were made possible thanks to the availability of designated datasets. These include data for predicting music, Last.FM2, and movies, MovieLens3. Until now little research has been performed on other genres, such as books. The ESWC challenge, however, made a book recommendation dataset available which is mapped to DBpedia. DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make it available as linked RDF data [14]. This dataset will be used as our main data source. In this paper, we focus on the feature construction for a classifier in that we also incorporate semantic features based on the semantic frames present within the items to be recommended. This is, to our knowledge, the first approach that tries to leverage this kind of data and is one way of tackling the issue of Limited Content Analysis within recommender systems [4]. In order to validate these claims we will compare and combine our best system with a system exploiting LOD.

3. FRAME-ENHANCEMENT

In this section we give some more information about why we believe exploiting frame information might help with recommendations. First, we introduce some basic concepts and theory, after which we explain how we apply a state-of-the-art semantic frame parser to our dataset and provide a first analysis. We hypothesize that a plot description tells more about a book than a more global semantic classification based on external semantic information as provided by the LOD cloud. This reasoning can be transferred to other data sources containing a large amount of textual information.

Table 1: Example of a frame
Frame: KILLING
The KILLER or CAUSE causes the death of the VICTIM.
FEs:
  KILLER      John drowned Martha.
  VICTIM      I saw heretics beheaded.
  CAUSE       The rockslide killed nearly half of the climbers.
  INSTRUMENT  It's difficult to suicide with only a pocketknife.
LUs: ..., kill.v, killer.n, killing.n, lethal.a, liquidate.v, liquidation.n, liquidator.n, lynch.v, massacre.n, massacre.v, matricide.n, murder.n, murder.v, murderer.n, ...

1 http://challenges.2014.eswc-conferences.org/index.php/RecSys
2 http://labrosa.ee.columbia.edu/millionsong/
3 http://grouplens.org/datasets/movielens/
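To make the frame representation of Table 1 concrete, a frame with its frame elements and lexical units can be modelled as a small data structure. This is an illustrative sketch only, not FrameNet's own data format: the `Frame` class and its fields are our invention.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """A FrameNet-style frame: a name, its frame elements (FEs),
    and the lexical units (LUs) that can evoke it in text."""
    name: str
    frame_elements: list = field(default_factory=list)
    lexical_units: list = field(default_factory=list)

# The KILLING frame from Table 1 (abridged).
killing = Frame(
    name="Killing",
    frame_elements=["Killer", "Victim", "Cause", "Instrument"],
    lexical_units=["kill.v", "killer.n", "lethal.a", "murder.v", "massacre.n"],
)

# A frame is evoked when one of its LU lemmas appears in a sentence;
# here we strip the part-of-speech suffix to obtain the lemma set.
lemmas = {lu.split(".")[0] for lu in killing.lexical_units}
print(sorted(lemmas))
```

A real pipeline would of course look these structures up in the FrameNet release rather than hand-code them.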
Table 2: Example of two sentences of a plot description and its resulting frames.
PLOT: The [Prince], the protagonist, is [named] Alexander. His [father], [Prince] Baudouin, is [murdered] by the [King] of Cornwall, [King] [March]. [When] Alexander [comes] of [age], he [sets out] to Camelot to [seek] justice from [King] Arthur and to [avenge] the [death] of his [father]....
FRAMES: Leadership, Appointing, Kinship, Leadership, Killing, Leadership, Leadership, Calendric unit, Temporal collocation, Arriving, Calendric unit, Departing, Seeking to achieve, Leadership, Revenge, Death, Kinship.

Figure 1: Example of Inheritance relations related to the KILLING frame.

3.1 Frame semantics and FrameNet

Following the basic assumption that the meanings of most words can best be understood on the basis of a semantic frame, FrameNet [9] was developed as a linguistic resource storing considerable information about lexical and predicate-argument semantics in English.

FrameNet is grounded in the theory of frame semantics [7, 8]. This theory tries to describe the meaning of a sentence by characterizing the background knowledge required to understand this sentence. This knowledge is presented in an idealized, i.e. prototypical, form. A frame is thus a structured representation of a concept. It can be a description of a type of event, relation or entity, and the participants in it. In Table 1 we present an example of such a frame, KILLING. We see it is a semantic class containing various predicates, also known as lexical units (LUs), evoking the described situation, e.g. killer, murder, lethal. Moreover, it illustrates that within FrameNet each frame comes with a set of semantic roles, i.e. frame elements (FEs), which can be perceived as the participants and/or properties of a frame and which are of course also lexicalized in the text itself, e.g. Killer: John, Instrument: with only a pocketknife.

FrameNet's latest release (1.5) contains 877 frames and about 155K exemplar sentences.4 An interesting aspect of the FrameNet lexicon is that asymmetric frame relations can relate two frames, thus forming a complex hierarchy containing both is-a-like and non-hierarchical relations [22]. In this work, we are particularly interested in the former type, also known as Inheritance relations. This type of relation entails that the child frame is a subtype of the parent frame. If we look for instance at our Killing example, whose taxonomy is visualized in Figure 1,5 we find that this frame is a child of the frame Transitive action, which is in turn a child of both the frame Objective Influence and, more interestingly, the frame Event. This taxonomy thus enables us to find even more semantic properties of specific frames.

3.2 Exploiting FrameNet

3.2.1 Book dataset

For the research described in this paper, we worked with the dataset of the ESWC challenge, which is in fact a re-elaborated version of the LibraryThing dataset6. This dataset contains books that are part of a particular user's online catalog containing the books he/she has read or owns. For the challenge, the books available in the dataset have been mapped to their corresponding DBpedia URIs [17]. Based on the available information we were able to download the plot description of each book from its corresponding Wikipedia page (this plot information is lacking in DBpedia). In this way we envisaged to investigate whether knowing more about what is actually happening in a book can enhance the recommendation. We worked with a subset by only including books for which a uniform and unambiguous DBpedia link was available and that actually contained plot information on Wikipedia. In total our final dataset contains 5,063 books with an average plot length of 312 words7.

In order to annotate the semantic frames, each plot was parsed using the state-of-the-art frame-semantic parser SEMAFOR [5]. This parser extracts semantic predicate-argument structures from text using a statistical model and is trained on the FrameNet 1.5 release. It takes the text as such as input, performs some preprocessing steps and outputs, on a sentence-per-sentence basis, all frames that are present within a text. These frames are represented by one of the 877 possible frame names, and the lexical units and frame elements (in both generic and lexicalized form) are output as well. An example is presented in Table 2: the plot description of the book The Prince and the Pilgrim. In the text itself, the lexical units evoking the frames are indicated in square brackets. The frames and LUs represented in bold are those frames which actually constitute an Event. Finding out which frames are Events can be done by exploiting the taxonomy (cf. supra), which enables us to find out more semantic properties of specific frames. Intuitively, we can state that especially those Event frames give most information about what is happening within a book: the above-mentioned book is clearly a revenge story. However, the other frames might also pinpoint important aspects, e.g. the repetition of the Leadership and Kinship frames could inform us that this novel is about royalty and family.

What this example also illustrates is that the SEMAFOR parser is not 100% accurate. For example, the name of a particular king – King March – is interpreted by the parser as evoking the frame Calendric unit. We should thus keep in mind that a certain amount of noise is also introduced into our dataset. Moreover, some frames, such as Arriving or Temporal collocation, are correctly labeled but do not really contribute interesting semantic information.

For all books in our dataset we parsed the plots using SEMAFOR, after which we also filtered out those frames which can have the Event frame as a parent. Some data statistics regarding these annotations are presented in Table 3, which reveal that the information we have available is rather skewed.

Table 3: Plot annotation statistics representing the average number of total and unique frames and events per book and their standard deviations
         # Avg   Stdev   # Avg unique   Stdev
Frames   197     205     96             61
Events   42      45      22             15

3.2.2 Semantic frames versus Linked Open Data

As previously mentioned, we hypothesize that using frames might represent different information than using semantic information represented in the LOD cloud. The books dataset we have at hand is particularly useful to verify this claim, since all books have been mapped to their DBpedia URIs.

In order to do so, we relied on a manual subdivision of all books into genres based on LOD. This classification was made by [23] by matching the abstract (dbo:abstract), the genre (dbo:literaryGenre, dbp:genre) and the subject (dcterms:subject) of each book against a regular expression pattern of thirty distinct genres. The authors performed this step to allow for more data coverage. However, by doing so they also made a combination of various LOD information categories, which enables us to directly compare these with our semantic frames. If we have a look at our running example, The Prince and the Pilgrim,8 we notice that this book is classified under the Fantasy genre.

Based on this genre mapping, we calculated the gain ratio [19] of our semantic frames representation with relation to the genres, thus considering the frames as features for genre classification. These gain ratios can then be observed as feature weights, and ranked according to the amount of information they add to discriminating between the thirty possible genres. We start our analysis by first only considering the semantic frame annotations. It became apparent, however, that it might be more interesting to also inspect more closely those frames which are Events, since these intuitively better represent what is actually happening.

The result of these analyses is presented in Table 4. Because of space constraints, we only represent the five genres covering most books of our dataset. This table each time contains the ten top features (frames and events), i.e. those with the highest gain ratio. The cell colour represents the manual analysis: indicated in light grey are those frames and events occurring only within one particular genre; in darker grey, the frames and events which are representative for a specific genre. Regarding the frames, we see that it is more difficult to find distinctive features correlating with the genre (light grey). In the upper part, only the Science Fiction and Crime genres contain truly representative frames based on our manual analysis (dark grey). If we go to the level of the Events, we see that this already allows for finding more unique events per genre. Again, the Science Fiction and Crime genres are best represented. When we had a closer look at other discriminating features we found the same tendency. In the Crime genre, for example, other Events such as Verdict, Revenge, Execution and Robbery all appeared within the top twenty features.

From this analysis we could deduce that both the frames and events might deliver the same type of information as the LOD, with the events being more representative. However, what becomes clear is that the frames also contribute additional information: they can represent what is happening within a book. If we again consider our running example (cf. Table 2), which is classified as Fantasy, we feel that enriching a recommender with semantic frame, and especially with event, information might account for a better recommendation. This brings us to the actual experiments.

4. EXPERIMENTS

For our experiments we focus on the generation of new, semantic features. In our experimental setting we aim to evaluate the contribution of those features and thus do not explicitly focus on engineering towards top recommendation performance.

4.1 Experimental Set-Up and Evaluation

We opt to add our semantic features to an existing recommender system [23], which participated, and performed well, in the ESWC'14 Challenge. Though we do apply feature weighting and feature selection as described below, the overall item classification and collaborative-filtering elements of the base system remain unchanged. This allows us to directly compare the predictive power of the frame-based features with the DBpedia-based features used by the original system, in particular as both approaches are different utilizations of the same information source, i.e. Wikipedia, and dataset, i.e. the ESWC RecSys Challenge data.

We use a reduced version of the dataset, based on a filtering of the 5,063 books that were retained as having sufficient plot information available (Section 3.2). This dataset has binary ratings and consists of 53,665 user-item-rating triples (6,162 users, 4,251 items) in the training data and 50,654 triples (6,180 users, 4,311 items) in the evaluation dataset. Even though this is a binary classification task, we opt to output the positive class likelihood and not the final binary classification, in order to avoid making a decision about the cut-off for the likelihood values. Consequently, we evaluate with root-mean-squared error (RMSE) to also capture the degree of confidence between the classification and the gold-standard test dataset9. RMSE is calculated as:

RMSE = sqrt( (1/m) * Σ_{i=1..m} (X_i - x_i)^2 )

in which X_i is the prediction and x_i the response value, i.e. the correct value for the task at hand, and m is the number of items for which a prediction is made. Speaking in practical terms, lower RMSE values are better, since they mean that the prediction confidence is closer to the actual gold standard.

4 This release is available at http://framenet.icsi.berkeley.edu
5 This graph was produced using the FrameGrapher tool, https://framenet.icsi.berkeley.edu/fndrupal/FrameGrapher
6 http://www.macle.nl/tud/LT/
7 This dataset will also be made available to the research community in due time.
8 http://dbpedia.org/resource/The_Prince_and_the_Pilgrim
9 Obtained from the ESWC'14 Challenge Chairs upon request.
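The RMSE metric of Section 4.1 is straightforward to sketch. The prediction and rating values below are invented for illustration: X_i are predicted positive-class likelihoods and x_i the binary gold ratings.

```python
import math

def rmse(predictions, gold):
    """Root-mean-squared error between predicted class likelihoods
    and the gold-standard (here: binary) ratings."""
    assert len(predictions) == len(gold)
    m = len(predictions)
    return math.sqrt(sum((X - x) ** 2 for X, x in zip(predictions, gold)) / m)

# Hypothetical predictions for four user-item pairs.
preds = [0.9, 0.2, 0.6, 0.1]
truth = [1, 0, 1, 0]
print(round(rmse(preds, truth), 4))  # → 0.2345
```

A perfect predictor yields an RMSE of 0.0, and confident wrong predictions are penalized more heavily than hesitant ones, which is exactly why the measure captures the degree of confidence.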
Table 4: Top ten features with the highest gain ratios in the five most popular LOD genres. Light grey cells represent genre-unique and dark grey ones genre-representative features.

FRAMES
  Fantasy: Jury deliberation; Bond maturation; Intentional traversing; Cause to rot; Get a job; Beyond compare; Representing; Locale by ownership; Ratification; Commutation
  Science Fiction: Beyond compare; Becoming dry; Containment relation; Dunking; Exclude member; Representing; Jury deliberation; Cause to rot; Medium; Cause change of phase
  History: Representing; Intentional traversing; Dominate competitor; Getting vehicle underway; Cause to rot; Beyond compare; Probability; Jury deliberation; Color qualities; Get a job
  Children: Memorization; Measure area; Estimated value; Rope manipulation; Degree of processing; Jury deliberation; Bond maturation; Intentional traversing; Cause to be dry; Drop in on
  Crime: Extradition; Go into shape; Exporting; Becoming dry; Arson; Measure area; Dominate competitor; Containment relation; Reading aloud; Extreme point

EVENTS
  Fantasy: Surrendering possession; Dodging; Immobilization; Renting; Reparation; Heralding; Soaking; Intentional traversing; Cause to rot; Get a job
  Science Fiction: Change of consistency; Immobilization; Execute plan; Cause impact; Reparation; Eventive affecting; Get a job; Cause to be sharp; Cause to rot; Cause change of phase
  History: Eventive affecting; Historic event; Extradition; Surrendering possession; Corroding caused; Dodging; Clemency; Intentional traversing; Cause to rot; Get a job
  Children: Intentionally affect; Examination; Absorb heat; Cause to experience; Fighting activity; Dodging; Rope manipulation; Intentional traversing; Drop in on; Cause to be dry
  Crime: Endangering; Posing as; Experience bodily harm; Enforcing; Cause to be wet; Intentionally affect; Intercepting; Change resistance; Go into shape; Extradition
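The gain-ratio ranking behind Table 4 can be sketched as follows. This is a minimal illustration over toy data (the frame/genre values are invented), not the implementation used by the authors, which follows [19]; it treats each frame as a binary attribute (present/absent per book) and scores it against the genre labels.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(attr_values, class_labels):
    """Gain ratio: (H(Class) - H(Class|Attr)) / H(Attr)."""
    n = len(class_labels)
    h_class = entropy(class_labels)
    h_attr = entropy(attr_values)
    h_cond = 0.0
    for value in set(attr_values):
        subset = [c for a, c in zip(attr_values, class_labels) if a == value]
        h_cond += (len(subset) / n) * entropy(subset)
    return (h_class - h_cond) / h_attr if h_attr > 0 else 0.0

# Toy data: does the presence of a frame discriminate two genres?
genres = ["Crime", "Crime", "Fantasy", "Fantasy"]
has_extradition = [1, 1, 0, 0]  # perfectly discriminating
has_kinship     = [1, 0, 1, 0]  # uninformative
print(gain_ratio(has_extradition, genres), gain_ratio(has_kinship, genres))
```

A frame occurring only in Crime books scores 1.0; a frame spread evenly over the genres scores 0.0, which is the intuition behind ranking the features of Table 4.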
In addition, again motivated by wanting to avoid choosing a cut-off point for the class assignment, we follow [12] and evaluate with a receiver operating characteristic (ROC) curve and also compute the area under the curve (AUC) for it. While, in contrast to RMSE, the ROC curve is computed only on the relative ordering of the predictions sorted by confidence values, it offers the advantage of showing how a classifier would perform given different cut-off values. In addition, with ROC we can compare against recommender systems that output only an (implicit) ranking and no class confidence values.

The base system by [23] we extend is a simple content-based recommender which trains two Naïve Bayes classifiers10 on book features acquired from DBpedia: one global classifier as background model and one per-user classifier to capture individual preferences, trained on a user neighborhood of variable size. In our experiments, we leave this setting unchanged and only vary the different features for item representation. We experimented with five different feature representations, which are explained in closer detail in the next section.

4.2 Feature Representation

1. Baselines

First, we established two baselines: the first baseline was constructed by including the majority class based on the training data; in our case the majority class is '0'. As a second baseline we decided to include a bag-of-words approach containing token unigrams from all the different plots.

The next three groups of features all relate to the frame representation of the plots based on the SEMAFOR output (cf. Section 3.2).

2. Frames

For the frames as such, we decided to include the resulting frame names (e.g. Killing, Kinship, Leadership) as a separate setting. In total this can lead to a maximum of 877 discriminating features, which is a large feature-space shrinkage compared to the bag-of-words representation. This is why we decided to also take into consideration those particular words evoking the frames, the Lexical Units (e.g. murdered, father, Prince), on the one hand, and the lexical representations of the Frame Elements – the semantic roles – evoked by each frame (e.g. Prince Baudouin, by the King of Cornwall, King March), on the other hand. In a final setting, we incrementally combine these various elements of data, thus giving more information to our classifier.

3. Events

As was illustrated in Section 3.2, the Events occurring within a book seem to intuitively represent important information about what is actually happening. This is why we also decided to perform the same experiments as with the frames, but this time only incorporating those frames which have a possible Event parent somewhere in the FrameNet hierarchy. Looking only at the Events further reduced our feature space to a maximum of 234 features. We therefore also made the same combinations as mentioned above with all possible LUs and FEs relating only to Events.

4. Taxonomy

In order to exploit the hierarchical structure of FrameNet even further, we decided to also investigate three other settings. First we explored whether including, besides a frame, also its direct parent, thus going one level up in the graph, might help. We did the same in the other direction, by only including the children which are at the bottom of our taxonomy (the leaves). Another way of incorporating this graph information was to calculate, for each possible frame pair found in a plot, its least common subsumer (LCS) [20], i.e. the closest parent both frames have in common.

10 Even though it is a simple approach, in a preliminary experiment Naïve Bayes outperformed an SVM, motivating us not to compare different classifiers but to focus on feature selection. In addition, Naïve Bayes was – as expected – significantly faster compared to other classifiers.
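A least-common-subsumer lookup over a frame taxonomy can be sketched as follows. The toy parent relation loosely mirrors Figure 1, but is simplified to single inheritance (FrameNet frames can in fact have several parents), and the `parents`/frame names are our own illustrative choices.

```python
def ancestors(frame, parent_of):
    """Return the frame plus all its ancestors, nearest first."""
    chain = [frame]
    while frame in parent_of:
        frame = parent_of[frame]
        chain.append(frame)
    return chain

def least_common_subsumer(f1, f2, parent_of):
    """Closest frame that is an ancestor (or self) of both f1 and f2."""
    anc2 = set(ancestors(f2, parent_of))
    for frame in ancestors(f1, parent_of):  # nearest-first gives the *least* subsumer
        if frame in anc2:
            return frame
    return None

# Toy Inheritance chains, loosely after Figure 1.
parent_of = {
    "Killing": "Transitive_action",
    "Transitive_action": "Event",
    "Arriving": "Event",
}
print(least_common_subsumer("Killing", "Arriving", parent_of))  # → Event
```

Very generic subsumers (e.g. a top-level Event returned for almost every pair) are exactly the ones the manual filtering step described next is meant to discard.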
Since the FrameNet taxonomy as such is not hypercomplex, i.e. the maximum distance between two frames is twelve, we decided to filter out those parents which are too generic by manually inspecting the LCS.11

Table 5: Experimental results on test dataset (N = 50,654) with classifier trained on different feature types (best results per category in bold).

           Features                  RMSE      AUC
Baselines  Majority voting (0)       0.7705    n/a
           Words as such             0.6145    0.5431
Frames     Frames as such            0.6272    0.5377
           Lexical units (LUs)       0.6266    0.5398
           Frame elements (FEs)      0.6036    0.5468
           Frames + LUs              0.6259    0.5389
           Frames + LUs + FEs        0.6036    0.5453
Events     Events as such            0.6132    0.5148
           Events + LUs              0.6259    0.5310
           Events + LUs + FEs        0.6237    0.5296
Taxonomy   Frames One up             0.6244    0.5297
           Frames Bottom             0.6253    0.5370
           Frames + LCS              0.6285    0.5376
LOD        DBpedia features          0.6022    0.5588
           DBpedia + FEs             0.5982    0.5498
           (DBpedia + FEs hybrid)    (0.5664)  (0.5571)

For the four above-mentioned setups, the same feature selection methods were employed. Of course, in order to allow for a good representation, all word-based features (BOW, LUs and FEs) were first tokenized, stemmed and filtered on stop words. For the automatic feature selection, we first use unsupervised feature attribute weighting by computing standard TF-IDF weights, since all our features are in the end derived from text (book plots):

TF-IDF_i = ln(1 + tf_i) * ln(N / df_i)

Next, we use attribute selection by computing the gain ratio with relation to the binary class label in the training data:

R_G(Attr, Class) = (H(Class) - H(Class|Attr)) / H(Attr)

This should allow us to filter out noise or unimportant features. We keep only those features with a gain ratio larger than zero (R_G > 0).12
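The logarithmic TF-IDF variant given in the formula above can be sketched as follows. The toy "plots" are invented and assumed to be already tokenized, stemmed and stop-word-filtered, as described in the text.

```python
import math
from collections import Counter

def tfidf(docs):
    """Weight each term per document as ln(1 + tf) * ln(N / df),
    matching the feature-weighting formula used above."""
    N = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: math.log(1 + tf[t]) * math.log(N / df[t]) for t in tf})
    return weights

# Toy plots, already preprocessed.
docs = [["murder", "king", "revenge"],
        ["love", "king", "wedding"],
        ["murder", "detective"]]
w = tfidf(docs)
# 'detective' (1 of 3 docs) outweighs 'murder' (2 of 3 docs) in the third plot:
print(w[2]["detective"] > w[2]["murder"])  # → True
```

Terms occurring in every document receive a weight of zero (ln(N/N) = 0), so the subsequent gain-ratio filter is not the only mechanism that suppresses uninformative features.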
5. Linked Open Data (LOD)
In a final setup we compare our best setting with the LOD features used by the base system, i.e. properties and values from DBpedia, and apply the same feature weighting and selection process. The features in the base system were manually selected and contain explicit book attributes, e.g. dbo:author (db:Umberto_Eco), but also categorical information such as dbo:literaryGenre (db:Historical_novel), dcterms:subject (category:Novels_set_in_Italy) or rdf:type (yago:PhilosophicalNovels), and untyped Wikipedia links in general.
We use the same set of features as reported by [23] but, to remain consistent across all experimental settings, apply our feature selection and weighting approach and use our reduced training and test dataset. In addition, we tested the combination of the DBpedia features with our best-performing frame approach.

5. RESULTS
We report experimental results for the different feature settings in Table 5. Overall, the two best-performing frame features are the Frame elements and Frames+LUs+FEs; both achieve an RMSE of 0.6036. The best result is obtained when combining the Frame elements with the LOD system (RMSE of 0.5982). Looking at the AUC for the ROC curve, both feature settings still perform very well, but not as well as the DBpedia features alone, which achieve the best overall AUC of 0.5588.

Considering the RMSE values, we observe that the majority baseline is easily outperformed by all different settings. The bag-of-words baseline, however, shows that having the words of the plot available for recommendation already constitutes a difficult-to-beat baseline. Contrary to our expectations, our settings with only frames or events do not outperform this baseline. We do see that the events as such, which constitute a much smaller feature space, perform slightly better than the frames. The bag-of-words baseline is only outperformed when using features that actually implement some sort of word-filtering mechanism: the Frame elements are the lexical representations of the words evoked by certain frames in the form of semantic roles. Even though these features are extracted from the text, they perform better than the bag-of-words baseline (Words as such, 0.6145), which does not make use of any semantic information. Analyzing the RG-ranked feature attributes revealed that also for the other best frame approach, Frames+LUs+FEs, the dominant attributes are the Frame elements; these were ranked highest. Strangely, we do not find a similar trend when performing the same combination with our Event frames. This is probably because the feature space is too small to make a well-informed decision.

Figure 2 presents the ROC curves for our features; for the sake of readability, only the most interesting curves are plotted. As to be expected from the AUC values, all curves are very close together. Besides not being far from the diagonal, no curve shows a clear cut-off value. We observe that the DBpedia features are slightly better on the left and partially the middle part of the curve, suggesting that those features are superior for recommender systems which focus on quality. Comparing the best frames-based approach (FEs) with the bag-of-words baseline (Words), we see that FEs are mostly better than just words, with some exceptions around a false positive rate of 0.23.

We also compare our system with the hybrid recommender system of Ristoski et al. [21] (AUC 0.5848), which was the second-best system of the ESWC challenge and performed essentially as well as the winning system. That system combined many different features: not only LOD, but also user ratings and explicit collaborative filtering approaches.13

11 We looked at the most frequent LCS nodes and excluded the first 10 generic nodes such as Artifact, Relation, Intentionally affect, Gradable attributes, Transitive action.
12 Preliminary experiments revealed that keeping all features, as well as doing classifier-based feature selection with OneR [13] with 5-fold cross-validation on the training data, constantly underperformed against this setting.
13 As that system only outputs scores for the purpose of ranking, we transformed those into confidences by dividing each score by a constant.
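The AUC values discussed above can be computed directly from per-user confidence scores via the rank-sum formulation. A minimal, dependency-free sketch (the function name is ours, not from the paper):

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly drawn positive item
    receives a higher confidence than a randomly drawn negative one
    (ties count half). Equivalent to the Mann-Whitney U statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Sweeping a threshold over the same scores yields the true/false positive rates plotted in Figure 2.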
[Figure 2 appears here: ROC curves (true positive rate vs. false positive rate) for the settings Words, FEs, DBpedia, DBpedia_FEs, and Ristoski.]

Figure 2: ROC curve for selected features and the top-performing ESWC system [21]
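The hybrid result reported in brackets in Table 5 combines the confidences of two separately trained classifiers with a simple equal-weight linear combination (cf. [1]). A minimal sketch of that late-fusion scheme, with hypothetical score lists as input:

```python
def linear_combination(conf_frames, conf_lod, weight=0.5):
    """Late fusion of two recommenders: a weighted average of their
    per-item confidence scores (weight applies to the first system)."""
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(conf_frames, conf_lod)]

def scores_to_confidences(scores):
    """Rescale raw ranking scores into [0, 1] confidences by dividing
    by a constant -- here, illustratively, the maximum score."""
    top = max(scores)
    return [s / top for s in scores]
```

With `weight=0.5` both systems contribute equally, matching the setting used for the bracketed results.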
Looking at the ROC curve, it becomes clear that incorporating more and diverse features is beneficial in this setting. However, we have to note that this system combines different recommenders using the Borda rank aggregation method, which was not learned on the training data but manually selected with knowledge of the test dataset (see also the comment below on our own combination model).

If we compare our best semantic frame results with the systems leveraging Linked Data, we see that we achieve a better performance (RMSE of 0.6022) when using the DBpedia features alone, and that we get the best overall results when combining our best system with the Linked Data features (RMSE of 0.5982). It thus appears that combining semantic information from different sources, i.e. the linguistically grounded frame features and the explicit, ontology-grounded DBpedia features, is beneficial in this setting. The AUC results, however, do not corroborate this finding.

Last, when not learning LOD+FEs together in one model but training them separately and combining the results with a simple linear combination (these results are presented in brackets in Table 5), as also done by [1], with equal weights of 0.5, we achieve better results (RMSE of 0.5664 and AUC of 0.5571). However, this notable improvement depends in the end on our knowledge of the test dataset, as it influenced our choice of a linear combination instead of learning the combination of the different classifiers on the training data. Strictly speaking, this is thus not a valid experimental result; nevertheless, it indicates that there is most likely a better hybrid design with feature combinations that will better utilize the semantic frame features and should yield better results.

6. CONCLUSION
In this paper we have presented an alternative approach to adding semantic information to a content-based book recommender system. We directly compared the addition of text-internal semantic frame information with text-external ontological information based on Linked Open Data (LOD), a popular research strand.

We have shown that parsing the book plots with a state-of-the-art semantic frame parser, SEMAFOR, delivers valuable additional semantic information. This information could enable a system to fully grasp what is happening within a book. One of the added values of FrameNet is that all frames are related in a taxonomy, which allows one to pinpoint those Events forming the key components of a book. Based on a direct comparison between the frames and events and a list of genres derived from DBpedia attributes, we have shown that although these data sources show some similarities, the semantic frames should be able to represent more specific information about what is happening in a particular book.

In order to test this claim in closer detail, we have performed experiments focused on generating new semantic features and finding out what these can contribute to a book recommendation system using one global classifier as background model and one per-user classifier. We see that exploiting semantic frame information outperforms a standard bag-of-words unigram baseline, and that especially incorporating frame elements and the lexical units evoking the frames allows for the best overall performance. If we compare our best system to a system leveraging semantic LOD information, we observe that our frames approach is not able to outperform this system. We do find, however, that if we combine these two semantic information sources into one system, we get the best overall performance. This might indicate that combining semantic information from different sources, i.e. the linguistically grounded implicit frame features and the explicit, ontology-grounded DBpedia features, is beneficial.

This work has inspired many ideas for future work. Considering the current setup, we are aware that we completely relied on the output of one semantic frame parser, i.e. SEMAFOR. We believe that using a filtering mechanism beforehand, e.g. to filter out those frames and/or events which are less meaningful or noisy, or applying a different parser or event extraction techniques, can shed new light on the added value of this type of information. Also, since we only relied on Wikipedia to extract book information, we had to reduce an originally larger book dataset. We realize a lot of additional information about books can be found online, for example on Google Books, Amazon, GoodReads, etcetera. The same techniques can also be used to extract other types of information from both the items and users under consideration for the recommendation task.

As mentioned at the end of Section 5, we would like to further investigate whether another hybrid design might yield better results. In this respect, it would be interesting to plug our semantic knowledge into a collaborative-filtering approach to see whether this can actually help the overall performance. Using our semantic frames we could also inspect in closer detail typical problems recommender systems face, such as cold start and data sparsity.

Acknowledgments
The work presented in this paper has been partly funded by the PARIS project (IWT-SBO-Nr. 110067). Furthermore, Orphée De Clercq was supported by an exchange grant from the German Academic Exchange Service (DAAD STIBET scholarship program). We would like to thank Christian Meilicke for his help in providing the manually derived genres and his help with building the original ESWC recommender system.

7. REFERENCES
[1] C. Basu, H. Hirsh, W. Cohen, et al. Recommendation as classification: Using social and content-based information in recommendation. In AAAI'98, pages 714–720, 1998.
[2] D. Billsus and M. J. Pazzani. User modeling for adaptive news access. User Modeling and User-Adapted Interaction, 10(2-3):147–180, 2000.
[3] M. Degemmis, P. Lops, and G. Semeraro. A content-collaborative recommender that exploits WordNet-based user profiles for neighborhood formation. User Modeling and User-Adapted Interaction, 17(3):217–255, 2007.
[4] T. Di Noia, R. Mirizzi, V. Ostuni, D. Romito, and M. Zanker. Linked open data to support content-based recommender systems. In I-SEMANTICS '12: Proceedings of the 8th International Conference on Semantic Systems, 2012.
[5] D. Das, A. F. T. Martins, N. Schneider, and N. A. Smith. Frame-semantic parsing. Computational Linguistics, 40(1):9–56, 2014.
[6] M. Eirinaki, M. Vazirgiannis, and I. Varlamis. SEWeP: Using site semantics and a taxonomy to enhance the web personalization process. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003.
[7] C. J. Fillmore. Frame semantics. Linguistics in the Morning Calm, pages 111–137, 1982.
[8] C. J. Fillmore. Frames and the semantics of understanding. Quaderni di Semantica, IV(2), 1985.
[9] C. J. Fillmore, C. R. Johnson, and M. R. L. Petruck. Background to FrameNet. International Journal of Lexicography, 16(3):235–250, 2003.
[10] E. Gabrilovich and S. Markovitch. Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, 2007.
[11] B. Heitmann and C. Hayes. Using linked data to build open, collaborative recommender systems. In AAAI Spring Symposium: Linked Data Meets Artificial Intelligence, 2010.
[12] J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl. Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22(1):5–53, 2004.
[13] R. C. Holte. Very simple classification rules perform well on most commonly used datasets. Machine Learning, 11(1):63–90, 1993.
[14] J. Lehmann, R. Isele, M. Jakob, A. Jentzsch, D. Kontokostas, P. N. Mendes, S. Hellmann, M. Morsey, P. van Kleef, S. Auer, and C. Bizer. DBpedia – A large-scale, multilingual knowledge base extracted from Wikipedia. Semantic Web Journal, 2013.
[15] P. Lops, M. de Gemmis, and G. Semeraro. Content-based recommender systems: State of the art and trends. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook, pages 73–105. Springer US, 2011.
[16] G. Martín, S. Schockaert, C. Cornelis, and H. Naessens. An exploratory study on content-based filtering of call for papers. Multidisciplinary Information Retrieval, Lecture Notes in Computer Science, 8201:58–69, 2013.
[17] V. Ostuni, T. Di Noia, E. Di Sciascio, and R. Mirizzi. Top-N recommendations from implicit feedback leveraging linked open data. In Proceedings of RecSys 2013, 2013.
[18] A. Passant. dbrec: Music recommendations using DBpedia. In Proceedings of the 9th International Semantic Web Conference (ISWC'10), 2010.
[19] J. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, California, 1993.
[20] P. Resnik. Using information content to evaluate semantic similarity in a taxonomy. In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI'95), 1995.
[21] P. Ristoski, E. L. Mencía, and H. Paulheim. A hybrid multi-strategy recommender system using Linked Open Data. In LOD-enabled Recommender Systems Challenge (ESWC 2014), 2014.
[22] J. Ruppenhofer, M. Ellsworth, M. R. L. Petruck, and C. R. Johnson. FrameNet II: Extended theory and practice. Technical report, 2005.
[23] M. Schuhmacher and C. Meilicke. Popular books and Linked Data: Some results for the ESWC-14 RecSys Challenge. In LOD-enabled Recommender Systems Challenge (ESWC 2014), 2014.