Exploiting FrameNet for Content-Based Book Recommendation

Orphée De Clercq
LT3, Language and Translation Technology Team
Ghent University
orphee.declercq@ugent.be

Michael Schuhmacher
Research Group Data and Web Science
University of Mannheim
michael@informatik.uni-mannheim.de

Simone Paolo Ponzetto
Research Group Data and Web Science
University of Mannheim
simone@informatik.uni-mannheim.de

Véronique Hoste
LT3, Language and Translation Technology Team
Ghent University
veronique.hoste@ugent.be

ABSTRACT
Adding semantic knowledge to a content-based recommender helps to better understand the item and user representations. Most recent research has focused on examining the added value of semantic features based on structured web data, in particular Linked Open Data (LOD). In this paper, we focus in contrast on semantic feature construction from text, by incorporating features based on semantic frames into a book recommendation classifier. To this purpose we leverage the semantic frames obtained by parsing the plots of the items under consideration with a state-of-the-art semantic parser. By investigating this type of semantic information, we show that these frames are also able to represent information about a particular book, but without the need for explicitly structured data describing the books. We reveal that exploiting frame information outperforms a basic bag-of-words approach and that especially the words relating to those frames are beneficial for classification. In a final step we compare and combine our system with the LOD features from a system leveraging DBpedia as knowledge resource. We show that both approaches yield similar results and reveal that combining semantic information from these two different sources might even be beneficial.

Categories and Subject Descriptors
H.3 [Information Storage and Retrieval]: Content Analysis and Indexing; H.4 [Information Systems Applications]: Miscellaneous

Keywords
Content-Based Recommender Systems, Semantic Frame, Linked Data

Copyright 2014 for the individual papers by the paper's authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.
CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA.

1. INTRODUCTION
   Recommender systems are omnipresent online and constitute a significant part of the marketing strategy of various companies. In recent years, a lot of advances have been made in constructing collaborative filtering systems, whereas the research on content-based recommenders has lagged somewhat behind. Similar to evolutions in information retrieval research, the focus has been more on optimizing tools and finding more sophisticated techniques, leveraging for example big data, than on the actual understanding or processing of the items or text at hand.
   In Natural Language Processing (NLP), on the other hand, huge advances have been made in processing text both from a lexical and a semantic perspective. In this respect, we believe it is important to test whether a content-based recommender system might actually benefit from plugging in more semantically enriched text features, which is the purpose of the current research. In this paper we wish to investigate to what extent leveraging semantic frame information can help in recommending books to users. We chose to work with books, since these typically contain a chronological description of certain actions or events which might be indicative of the interests of a particular reader. Someone might enjoy reading historical novels, for example, but prefer those novels where a love story is explained in closer detail over those where a typical revenge story is portrayed. We hypothesize that the semantic frames and/or events in these two types of historical novels will be different. In other words, we wish to investigate to what extent deep semantic parsing of the plots describing a book following the FrameNet paradigm can help for recommendation.
   In order to validate these claims we performed an extensive analysis on a book recommendation dataset which was provided in the framework of the 2014 ESWC challenge. What is particularly interesting about this dataset is that all the books have been mapped to their corresponding DBpedia URIs, which allows us to directly compare externally
gained semantic information as available in the Linked Open Data cloud (LOD) with internal semantic information based on the plots themselves. Our analysis reveals that although some frames and events are good indicators of genres derived from external DBpedia information, they also represent some additional information which might help the recommendation process.
   To actually verify this finding we test the added value of incorporating frame information as semantic features in a basic recommender system. We see that exploiting this kind of semantic information outperforms a standard bag-of-words unigram baseline and that incorporating frame elements and lexical units evoking the frames yields the best frame-based performance. If we compare our best system to a system leveraging semantic LOD information, we observe that our frames approach is not able to outperform this system. We do find, however, that if we combine these two semantic information sources into one system we get the best overall performance. This might indicate that combining semantic information from different sources, i.e. from the linguistically grounded implicit frame features and the explicit, ontology-grounded DBpedia features, is beneficial.
   The remainder of this paper is structured as follows. In Section 2 we describe some related work with an explicit focus on the added value of semantic information for recommender systems. In Section 3 we then explain in closer detail the construction and reasoning behind the semantic frame enhancement. We then continue by describing the actual experimental setup (Section 4) and have a closer analysis of the results (Section 5). We finish with some concluding remarks and ideas for future work (Section 6).

2. RELATED WORK
   In content-based recommender systems, the items to be recommended are represented by a set of features based on their content, whereas a user is represented by his or her profile. To build a recommender both information sources are compared. Most content-based recommenders use quite simple retrieval models, such as keyword matching or the vector space model with basic TF-IDF weighting [15]. A problem with these models is that they tend to ignore semantic information. To overcome this one can use Explicit Semantic Analysis (ESA) [10] instead of TF-IDF weighting, which allows a document to be represented as a weighted vector of concepts. Another way to add more linguistic knowledge is to use for example information from WordNet as done by [6, 3]. An alternative is to use language models to represent documents. This was done for example by [16] when exploring content-based filtering of calls for papers. Besides retrieval models, machine learning techniques where a system learns the user profile and classifies items as interesting or not are also used for content-based recommenders. One of the first to do this was [2], using a Naïve Bayes classifier.
   When it comes to adding semantic information to recommender systems we see that currently leveraging Linked Open Data (LOD) is a popular research strand. [11] and [18] were among the first to use LOD for recommendation. The former use this information to build open recommender systems whereas the latter built a music recommender using collaborative filtering techniques. [4] was the first to really leverage LOD to build a content-based recommender and the first to exploit the semantics of the relations in the link hierarchy. They use LOD information from DBpedia, Freebase and LinkedMDB as the only background knowledge for a movie recommender system and show that thanks to this ontological information the quality of a standard content-based system can be improved. In more recent work, the semantic item descriptions based on LOD have been merged with positive implicit feedback in a graph-based representation to produce a hybrid top-N item recommendation algorithm, SPrank [17], which further underlines the added value of this kind of data. Moreover, in 2014, in order to spark research on LOD and content-based recommender systems, a shared task was organized by the same authors, i.e. the ESWC-14 Challenge^1.
   In content-based recommendation, the advances that have been made were made possible thanks to the availability of designated datasets. These include data for predicting music, Last.FM^2, and/or movies, MovieLens^3. Up till now little research has been performed on other domains, such as books. The ESWC challenge, however, made a book recommendation dataset available which is mapped to DBpedia. DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make it available as linked RDF data [14]. This dataset will be used as our main data source. In this paper, we focus on the feature construction for a classifier in that we also incorporate semantic features based on the semantic frames present within the items to be recommended. This is, to our knowledge, the first approach that tries to leverage this kind of data and is one way of tackling the issue of Limited Content Analysis within recommender systems [4]. In order to validate these claims we will compare and combine our best system with a system exploiting LOD.

3. FRAME-ENHANCEMENT
   In this section we give some more information about why we believe exploiting frame information might help with recommendations. First, we introduce some basic concepts and theory, after which we explain how we apply a state-of-the-art semantic frame parser to our dataset and provide a first analysis. We hypothesize that a plot description tells more about a book than a more global semantic classification based on external semantic information as provided by the LOD cloud. This reasoning can be transferred to other data sources having a large amount of textual information.

Table 1: Example of a frame

Frame: KILLING
The KILLER or CAUSE causes the death of the VICTIM.

FEs   KILLER       John drowned Martha.
      VICTIM       I saw heretics beheaded.
      CAUSE        The rockslide killed nearly half of the climbers.
      INSTRUMENT   It's difficult to suicide with only a pocketknife.
LUs   ..., kill.v, killer.n, killing.n, lethal.a, liquidate.v, liquidation.n, liquidator.n, lynch.v, massacre.n, massacre.v, matricide.n, murder.n, murder.v, murderer.n, ...

^1 http://challenges.2014.eswc-conferences.org/index.php/RecSys
^2 http://labrosa.ee.columbia.edu/millionsong/
^3 http://grouplens.org/datasets/movielens/
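As an illustration only, the structure of a frame as shown in Table 1 (a name, a definition, a set of frame elements and a set of lexical units) can be sketched as a small record type. The Python sketch below is an assumption about a convenient in-memory representation; it is not FrameNet's own data format or API, and the names Frame and build_lu_index are hypothetical. Only a subset of the lexical units listed in Table 1 is included.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical record mirroring the layout of Table 1."""
    name: str
    definition: str
    frame_elements: Tuple[str, ...]   # semantic roles (FEs)
    lexical_units: Tuple[str, ...]    # predicates evoking the frame (LUs)


# The Killing frame, filled in with values taken from Table 1.
KILLING = Frame(
    name="Killing",
    definition="The KILLER or CAUSE causes the death of the VICTIM.",
    frame_elements=("Killer", "Victim", "Cause", "Instrument"),
    lexical_units=("kill.v", "killer.n", "killing.n", "lethal.a",
                   "murder.n", "murder.v", "murderer.n"),
)


def build_lu_index(frames: Iterable[Frame]) -> Dict[str, Set[str]]:
    """Map each lexical unit to the names of the frames it can evoke."""
    index: Dict[str, Set[str]] = {}
    for frame in frames:
        for lu in frame.lexical_units:
            index.setdefault(lu, set()).add(frame.name)
    return index


LU_INDEX = build_lu_index([KILLING])
```

With such an index, looking up a lexical unit such as murder.v returns the set of frames it may evoke. A real frame-semantic parser additionally has to disambiguate between candidate frames in context, which this sketch does not attempt.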
                                                                       Table 2: Example of two sentences of a plot descrip-
                                                                       tion and its resulting frames.

                                                                                     The [Prince], the protagonist, is [named] Alexander.
                                                                                      His [father], [Prince] Baudouin, is [murdered] by


                                                                           PLOT
                                                                                        the [King] of Cornwall, [King] [March]. [When]
                                                                                     Alexander [comes] of [age], he [sets out] to Camelot
                                                                                     to [seek] justice from [King] Arthur and to [avenge]
                                                                                                  the [death] of his [father]....
                                                                                    Leadership, Appointing, Kinship, Leadership, Killing,


                                                                           FRAMES
Figure 1: Example of Inheritance relations related                                          Leadership, Leadership, Calendric unit,
to the KILLING frame.                                                                   Temporal collocation,Arriving, Calendric unit,
                                                                                     Departing,Seeking to achieve, Leadership, Revenge,
                                                                                                         Death, Kinship.
3.1    Frame semantics and FrameNet
   Following the basic assumption that the meanings of most
words can best be understood on the basis of a seman-                  elaborated version of the LibraryThing dataset6 . This dataset
tic frame, FrameNet [9] was developed as a linguistic re-              contains books that are part of a particular user’s online
source storing considerable information about lexical and              catalog containing the books he/she has read or owns. For
predicate-arguments semantics in English.                              the challenge, the books available in the dataset have been
   FrameNet is grounded in the theory of frame semantics [7,           mapped to their corresponding DBpedia URIs [17]. Based
8]. This theory tries to describe the meaning of a sentence            on the available information we were able to download the
by characterizing the background knowledge required to un-             plot description of each book from its corresponding Wikipedia
derstand this sentence. This knowledge is presented in an              page (this plot information is lacking in DBpedia). In this
idealized, i.e. prototypical, form. A frame is thus a struc-           way we envisaged to investigate whether knowing more about
tured representation of a concept. It can be a description             what is actually happening in a book can enhance the rec-
of a type of event, relation or entity, and the participants           ommendation. We worked with a subset by only including
in it. In Table 1 we present an example of such a frame,               books of which a uniform and unambiguous DBpedia link
KILLING. We see it is a semantic class containing various              was available and that actually contained plot information
predicates, also known as lexical units (LUs), evoking the             on Wikipedia. In total our final dataset contains 5,063 books
described situation, e.g. killer, murder, lethal. Moreover,            with an average plot length of 312 words7 .
it illustrates that within FrameNet each frame comes with                 In order to annotate the semantic frames, each plot was
a set of semantic roles, i.e. frame elements (FEs), which              parsed using the state-of-the-art frame-semantic parser SE-
can be perceived as the participants and/or properties of a            MAFOR [5]. This parser extracts semantic predicate-
frame which are of course also lexicalized in the text itself,         argument structures from text using a statistical model and
e.g. Killer: John, Instrument: with only a pocketknife.                is trained on the FrameNet 1.5 release. It takes as input the
   FrameNet’s latest release (1.5) contains 877 frames and             text as such, performs some preprocessing steps and outputs
about 155K exemplar sentences.4 An interesting aspect of               on a sentence-per-sentence basis all frames that are present
the FrameNet lexicon is that asymmetric frame relations can            within a text. These frames are represented by one of the 877
relate two frames, thus forming a complex hierarchy contain-           possible frame names and also the lexical units and frame
ing both is-a like and non-hierarchical relations [22]. In this        elements (both generic and lexicalized form) are output. An
work, we are particularly interested in the former type, also          example is presented in Table 2. This is the plot description
known as Inheritance relations. This type of relation entails          of the book The Prince and the Pilgrim. In the text itself,
that the child frame is a subtype of the parent frame. If we           the lexical units evoking the frames are indicated in square
look for instance at our Killing example, of which the taxon-          brackets. The frames and LUs which are represented in bold
omy is visualized in Figure 15 , we are able to find out that          are those frames which actually constitute an Event. Find-
this frame is a child of the frame Transitive action, which is         ing out which books are events can be done by exploiting
in turn a child of both the frame Objective Influence and,             the taxonomy (cfr. supra) which enables us in a way to find
more interestingly, the frame Event. This taxonomy thus                out more semantic properties of specific frames. Intuitively,
enables us to find even more semantic properties about spe-            we can state that especially those Event frames give most
cific frames.                                                          information about what is happening within a book: the
                                                                       above-mentioned book is clearly a revenge story. However,
3.2    Exploiting FrameNet                                             the other frames might also pinpoint important aspects, e.g.
                                                                       the repetition of the Leadership and Kinship frames could
3.2.1 Book dataset                                                     inform us that this novel is about royalty and family.
  For the research described in this paper, we worked with                What this example also illustrates is that the SEMAFOR
the dataset of the ESWC challenge which is in fact a re-               parser is not 100% accurate. For example, the name of a
                                                                       particular king – King March – is interpreted by the parser
4                                                                      as evoking the frame Calendric unit. We should thus keep
  This release is available at http://framenet.icsi.
berkeley.edu
5                                                                      6
  This graph was produced using the FrameGrapher                        http://www.macle.nl/tud/LT/
                                                                       7
tool,  https://framenet.icsi.berkeley.edu/fndrupal/                     This dataset will also be made available to the research
FrameGrapher                                                           community in due time


                                                                  15
in mind that a certain amount of noise is also introduced                 frames based on our manual analysis (dark grey). If we go
into our dataset. Moreover, some frames such as Arriving                  to the level of the Events, we see that this already allows for
or Temporal collocation, are correctly labeled but do not                 finding more unique events per genre. Again, the Science
really contribute interesting semantic information.                       Fiction and Crime genre are best represented. When we
  For all books in our dataset we parsed the plots using                  had a closer look at other discriminating features we found
SEMAFOR, after which we also filtered out those frames                    the same tendency. In the Crime genre, for example, other
which can have the Event frame as a parent. Some data                     Events such as Verdict, Revenge, Execution, Robbery all
statistics regarding these annotations are presented in Ta-               appeared within the top twenty features.
ble 3, which reveal that the information we have available is                From this analysis we could deduce that both the frames
rather skewed.                                                            and events might deliver the same type of information as the
                                                                          LOD, with the events being more representative. However,
                                                                          what becomes clear is that the frames also contribute more
Table 3: Plot annotation statistics representing                          information. They can represent what is happening within
the average number of real and unique frames and                          a book. If we again consider our running example (cfr. Ta-
events per book and their standard deviations                             ble 2), which is classified as Fantasy, we feel that enriching
                                                                          a recommender with semantic frame, and especially with
                # Avg    Stdev     # Avg unique      Stdev                event information, might account for a better recommenda-
       Frames     197      205               96         61                tion. This brings us to the actual experiments.
       Events      42       45               22         15

                                                                          4.    EXPERIMENTS
3.2.2 Semantic frames versus Linked Open Data                                For our experiments we focus on the generation of new,
   As previously mentioned, we hypothesize that using frames              semantic features. In our experimental setting we aim to
might represent di↵erent information than using semantic                  evaluate the contribution of those features and thus do not
information represented in the LOD-cloud. The books dataset               explicitly focus on engineering towards a top recommenda-
we have at hand is particularly useful to verify this claim               tion performance.
since all books have been mapped to their DBpedia URIs.
   In order to do so, we relied on a manual subdivision of                4.1    Experimental Set-Up and Evaluation
all books in genres based on LOD. This classification was                    We opt to add our semantic features to an existing recom-
made by [23] by parsing the abstract (dbo: abstract),                     mender system [23], which participated, and performed well,
the genre ( dbo:literaryGenre, dbp:genre) and the subject                 in the ESWC’14 Challenge. Though we do apply feature
(dcterms: subject) of each book against a regular expres-                 weighting and feature selection as described below, the over-
sion pattern of thirty distinct genres. The authors performed             all item classification and collaborative-filtering elements of
this step to allow for more data coverage. However, by doing              the base system remain unchanged. This allows us to di-
so they also made a combination of various LOD informa-                   rectly compare the predictive power of the frame-based fea-
tion categories which enables us to directly compare these                tures with the DBpedia-based features used by the original
with our semantic frames. If we have a look at our running                system, in particular as both approaches are di↵erent uti-
example, The Prince and the Pilgrim,8 , we notice that this               lizations of the same information source, i.e. Wikipedia,
book is classified under the Fantasy genre.                               and dataset, i.e. the ESWC RecSys Challenge data.
   Based on this genre mapping, we calculated the gain ra-                   We use a reduced version of the dataset, based on a filter-
tio [19] of our semantic frames representation with relation              ing of the 5,063 books that were retained as having sufficient
to the genres, thus considering the frames as features allow-             plot information available (Section 3.2). This dataset has bi-
ing to do genre classification. These gain ratios can then                nary ratings and consists of 53,665 user-item-rating triples
be observed as feature weights, and ranked according to the               (6,162 users, 4,251 items) in the training data and 50,654
amount of information they add to discriminating between the thirty possible genres. We start
our analysis by first only considering the semantic frame annotations. It became apparent,
however, that it might be more interesting to also inspect more closely those frames which are
Events, since these intuitively better represent what is actually happening.
   The result of these analyses is presented in Table 4. Because of space constraints, we only
show the five genres covering most books of our dataset. For each genre, the table lists the
ten top features (frames and events), i.e. those with the highest gain ratio. The cell colour
represents the manual analysis: light grey indicates frames and events occurring only within
one particular genre, and darker grey indicates frames and events which are representative for
a specific genre. Regarding the frames, we see that it is more difficult to find distinctive
features correlating with the genre (light grey). In the upper part, only the Science Fiction
and Crime genre contain truly representative

triples (6,180 users, 4,311 items) in the evaluation dataset.
   Even though this is a binary classification task, we opt to output the positive class
likelihood and not the final binary classification in order to avoid making a decision about
the cut-off for the likelihood values. Consequently, we evaluate with root-mean-squared error
(RMSE) to also capture the degree of confidence between the classification and the
gold-standard test dataset9. RMSE is calculated as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (X_i - x_i)^2}$$

in which X_i is the prediction and x_i the response value, i.e. the correct value for the task
at hand, and m is the number of items for which a prediction is made. Speaking in practical
terms, the lower the RMSE value the better,

8 http://dbpedia.org/resource/The_Prince_and_the_Pilgrim
9 Obtained from the ESWC'14 Challenge Chairs upon request.


Table 4: Top ten features with the highest gain ratios in the five most popular LOD genres. Light grey cells
represent genre-unique and dark grey ones genre-representative features.

              Fantasy                   Science Fiction         History                    Children                 Crime
     FRAMES   Jury deliberation         Beyond compare          Representing               Memorization             Extradition
              Bond maturation           Becoming dry            Intentional traversing     Measure area             Go into shape
              Intentional traversing    Containment relation    Dominate competitor        Estimated value          Exporting
              Cause to rot              Dunking                 Getting vehicle underway   Rope manipulation        Becoming dry
              Get a job                 Exclude member          Cause to rot               Degree of processing     Arson
              Beyond compare            Representing            Beyond compare             Jury deliberation        Measure area
              Representing              Jury deliberation       Probability                Bond maturation          Dominate competitor
              Locale by ownership       Cause to rot            Jury deliberation          Intentional traversing   Containment relation
              Ratification              Medium                  Color qualities            Cause to be dry          Reading aloud
              Commutation               Cause change of phase   Get a job                  Drop in on               Extreme point

     EVENTS   Surrendering possession   Change of consistency   Eventive affecting         Intentionally affect     Endangering
              Dodging                   Immobilization          Historic event             Examination              Posing as
              Immobilization            Execute plan            Extradition                Absorb heat              Experience bodily harm
              Renting                   Cause impact            Surrendering possession    Cause to experience      Enforcing
              Reparation                Reparation              Corroding caused           Fighting activity        Cause to be wet
              Heralding                 Eventive affecting      Dodging                    Dodging                  Intentionally affect
              Soaking                   Get a job               Clemency                   Rope manipulation        Intercepting
              Intentional traversing    Cause to be sharp       Intentional traversing     Intentional traversing   Change resistance
              Cause to rot              Cause to rot            Cause to rot               Drop in on               Go into shape
              Get a job                 Cause change of phase   Get a job                  Cause to be dry          Extradition
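
The gain-ratio ranking behind Table 4 can be illustrated with a minimal sketch. It uses the
gain ratio as defined in Section 4.2, R_G(Attr, Class) = (H(Class) - H(Class|Attr)) / H(Attr),
and assumes that each frame is reduced to a binary attribute (present or absent in a plot).
The function names `entropy` and `gain_ratio`, the binary encoding and the four-book toy data
are assumptions made for illustration; they are not the feature pipeline used in the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(attr, labels):
    """R_G(Attr, Class) = (H(Class) - H(Class|Attr)) / H(Attr)."""
    h_attr = entropy(attr)
    if h_attr == 0.0:  # attribute takes a single value, so it cannot split the data
        return 0.0
    total = len(labels)
    # H(Class|Attr): label entropy inside each attribute value, weighted by its frequency
    h_cond = 0.0
    for value, count in Counter(attr).items():
        subset = [c for a, c in zip(attr, labels) if a == value]
        h_cond += (count / total) * entropy(subset)
    return (entropy(labels) - h_cond) / h_attr


# Hypothetical toy data: one row per book, 1 = frame occurs in the plot.
frames = {"Arson": [1, 1, 0, 0], "Renting": [1, 0, 1, 0]}
genres = ["Crime", "Crime", "Fantasy", "Fantasy"]

# Rank the frames by gain ratio, highest first.
ranked = sorted(frames, key=lambda f: gain_ratio(frames[f], genres), reverse=True)
```

In this toy setting the first frame separates the two genres perfectly while the second
carries no information about them, so only the first would survive a filter that keeps
features with a gain ratio larger than zero.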


because the prediction confidence is then closer to the actual gold standard.
   In addition, again motivated by the wish to avoid choosing a cut-off point for the class
assignment, we follow [12] and evaluate with a receiver operating characteristic (ROC) curve
and also compute the area under the curve (AUC) for it. While, in contrast to RMSE, the ROC
curve is computed only on the relative ordering of the predictions sorted by confidence
values, it offers the advantage of understanding how a classifier would perform given
different cut-off values. In addition, with ROC we can compare against recommender systems
that output only an (implicit) ranking and no class confidence values.
   The base system by [23] that we extend is a simple content-based recommender which trains
two Naïve Bayes classifiers10 on book features acquired from DBpedia: one global classifier as
background model and one per-user classifier to capture individual preferences, trained on a
user-neighborhood of variable size. In our experiments, we leave this setting unchanged and
only vary the features used for item representation. We experimented with five different
feature representations, which are explained in more detail in the next section.

10 Even though it is a simple approach, in a preliminary experiment Naïve Bayes outperformed an
SVM, motivating us to not compare different classifiers but to focus on feature selection. In
addition, Naïve Bayes was (as expected) significantly faster than other classifiers.

4.2 Feature Representation

  1. Baselines

   First, we established two baselines: the first baseline was constructed by including the
majority class based on the training data; in our case the majority class is '0'. As a second
baseline we decided to include a bag-of-words approach containing token unigrams from all the
different plots.
   The next three groups of features all relate to the frame representation of the plots based
on the SEMAFOR output (cf. Section 3.2).

  2. Frames

   For the frames as such, we decided to include the resulting frame names (e.g. Killing,
Kinship, Leadership) as a separate setting. In total this can lead to a maximum of 877
discriminating features, which is a large feature space shrinkage compared to the bag-of-words
representation. This is why we decided to also take into consideration those particular words
evoking the frames, the Lexical Units (e.g. murdered, father, Prince) on the one hand, and the
lexical representations of the Frame Elements (the semantic roles) evoked by this frame on the
other hand (e.g. Prince Baudouin, by the King of Cornwall, King March). In a final setting, we
incrementally combine these various elements of data, thus giving more information to our
classifier.

  3. Events

   As was illustrated in Section 3.2, the Events occurring within a book seem to intuitively
represent important information about what is actually happening. This is why we also decided
to perform the same experiments as with the frames, but this time only incorporating those
frames which have a possible Event parent somewhere in the FrameNet hierarchy. Looking only at
the Events further reduced our feature space to a maximum of 234 features. We therefore also
made the same combinations as mentioned above with all possible LUs and FEs relating only to
Events.

  4. Taxonomy

   In order to exploit the hierarchical structure of FrameNet even further, we decided to also
investigate three other settings. First we explored whether including, besides a frame, also
its direct parent, thus going one level up in the graph, might help. We did the same in the
other direction, by only including the children which are at the bottom of our taxonomy (the
leaves). Another way of incorporating this graph information was to calculate for each possible
frame pair that was found in a plot its least common subsumer [20] (LCS), i.e. the parent both
frames have in common resulting


in the shortest path. Since the FrameNet taxonomy as such is not hypercomplex, i.e. the maximum
distance between two frames is twelve, we decided to filter out those parents which are too
generic by manually inspecting the LCS.11

11 We looked at the most frequent LCS nodes and excluded the first 10 generic nodes such as
Artifact, Relation, Intentionally affect, Gradable attributes, Transitive action.

   For the four above-mentioned setups, the same feature selection methods were employed. Of
course, in order to allow for a good representation, all word-based features (bow, LUs and FEs)
were first tokenized, stemmed and filtered for stop words. For the automatic feature selection,
we first use unsupervised feature attribute weighting by computing the standard TF-IDF weights,
since all our features are in the end derived from text (book plots).

$$\mathrm{TF\text{-}IDF}_i = \ln(1 + tf_i) \cdot \ln(N / df_i)$$

Next, we use attribute selection by computing the gain ratio with relation to the binary class
label in the training data:

$$R_G(Attr, Class) = \frac{H(Class) - H(Class \mid Attr)}{H(Attr)}$$

This should allow us to filter out noise or unimportant features. We keep only those features
with a gain ratio larger than zero (R_G > 0).12

12 Preliminary experiments revealed that keeping all features as well as doing classifier-based
feature selection with OneR [13] with 5-fold cross-validation on the training data consistently
underperformed compared with this setting.

  5. Linked Open Data (LOD)

   In a final setup we compare our best setting with the LOD features used by the base system,
i.e. properties and values from DBpedia, and apply the same feature weighting and selection
process. The features in the base system were manually selected and contain explicit book
attributes, e.g. dbo:author (db:Umberto_Eco), but also categorical information such as
dbo:literaryGenre (db:Historical_novel), dcterms:subject (category:Novels_set_in_Italy) or
rdf:type (yago:PhilosophicalNovels) and untyped Wikipedia links in general.
   We use the same set of features as reported by [23], but, to remain consistent across all
experimental settings, apply our feature selection and weighting approach and use our reduced
training and test dataset. In addition, we tested the combination of the DBpedia features with
our best-performing frame approach.

5. RESULTS
   We report experimental results for the different feature settings in Table 5. Overall, the
two best performing frame features are the Frame elements and the Frames+LUs+FEs, which both
achieve an RMSE of 0.6036. The best result is obtained by combining the Frame elements with
the LOD system (RMSE of 0.5982). Looking at the AUC for the ROC curve, both features still
perform very well, but not as well as the DBpedia features alone, which achieve the best
overall AUC of 0.5588.
   Considering the RMSE values, we observe that the majority baseline is easily outperformed by
all different settings. Looking at the bag-of-words baseline, however, illustrates that having
the words of the plot available for recommendation is already a baseline that is quite
difficult to beat.

Table 5: Experimental results on test dataset (N = 50,654) with classifier trained on different
feature types (best results per category marked with *).

            Features                   RMSE       AUC
Baselines   Majority voting (0)        0.7705     n/a
            Words as such              0.6145*    0.5431*
Frames      Frames as such             0.6272     0.5377
            Lexical units (LUs)        0.6266     0.5398
            Frame elements (FEs)       0.6036*    0.5468*
            Frames + LUs               0.6259     0.5389
            Frames + LUs + FEs         0.6036*    0.5453
Events      Events as such             0.6132*    0.5148
            Events + LUs               0.6259     0.5310*
            Events + LUs + FEs         0.6237     0.5296
Taxonomy    Frames One up              0.6244*    0.5297
            Frames Bottom              0.6253     0.5370
            Frames + LCS               0.6285     0.5376*
LOD         DBpedia features           0.6022     0.5588*
            DBpedia + FEs              0.5982*    0.5498
            (DBpedia + FEs hybrid)     (0.5664)   (0.5571)

Contrary to our expectations, our settings with only frames or events do not outperform this
baseline. We do see that the events as such, which constitute a much smaller feature space,
perform slightly better than the frames. The bag-of-words baseline is only outperformed when
using features that actually provide some sort of word filtering mechanism: the Frame Elements
are the lexical representation of words which are evoked by certain frames in the form of
semantic roles. Even though these features are extracted from the text, they perform better
than the bag-of-words (Words as such) baseline approach (0.6145), which does not make use of
any semantic information. Analyzing the R_G-ranked feature attributes revealed that also for
the other best frame approach, Frames+LUs+FEs, the dominant attributes are the Frame elements,
which were ranked highest. What is strange is that we do not find a similar trend when
performing the same combination with our Event frames. This is probably because the feature
space is too small to make a well-informed decision.
   Figure 2 presents the ROC curves for our features; for the sake of readability only the most
interesting curves are plotted. As is to be expected from the AUC values, all curves are very
close together. Besides not being far away from the diagonal, no curve shows a clearly
recognizable cut-off value. We observe that the DBpedia features are slightly better for the
left and partially the middle part of the curve, leading to the interpretation that those
features are superior for recommender systems which focus on quality. Comparing the best
frames-based approach (FEs) with the bag-of-words baseline (Words), we see that FEs are mostly
better than just words, with an exception around a false positive rate of 0.23.
   We also compare our system with the hybrid recommender system from Ristoski et al. [21] (AUC
0.5848), which was the second best system of the ESWC challenge and performed essentially as
well as the winning system. That system combined many different features, not only LOD, but
also user ratings and explicit collaborative filtering approaches.13

13 As that system only outputs scores for the purpose of ranking, we transformed those into
confidences by dividing each


[Figure 2 plot: ROC curves (true positive rate against false positive rate, both axes from 0.0
to 1.0) for Words, FEs, DBpedia, DBpedia_FEs and Ristoski.]

               Figure 2: ROC curve for selected features and the top-performing ESWC system [21]
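
The two evaluation measures used here, RMSE and the area under the ROC curve, can be sketched
in a few lines. This is a minimal illustration under stated assumptions, not the evaluation
code used for the experiments: the function names `rmse` and `roc_auc` and the four toy
predictions are made up, and the AUC is computed with the pairwise (Mann-Whitney) formulation
rather than by integrating a plotted curve.

```python
import math


def rmse(predictions, gold):
    """Root-mean-squared error between confidences X_i and gold values x_i."""
    m = len(gold)
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(predictions, gold)) / m)


def roc_auc(scores, gold):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Equals the probability that a randomly chosen positive item receives a higher
    score than a randomly chosen negative item; ties count as one half.
    """
    pos = [s for s, g in zip(scores, gold) if g == 1]
    neg = [s for s, g in zip(scores, gold) if g == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: gold labels and positive-class confidences for four items.
gold = [1, 0, 1, 0]
confidences = [0.9, 0.2, 0.4, 0.6]

error = rmse(confidences, gold)
auc = roc_auc(confidences, gold)
# AUC depends only on the ordering, so rescaling all scores leaves it unchanged.
auc_rescaled = roc_auc([10 * s for s in confidences], gold)
```

Because the AUC only uses the relative ordering of the scores, dividing ranking scores by a
constant changes the RMSE but not the AUC, which is why a ranking-only system can still be
compared on the ROC curve.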


score by a constant.

   Looking at the ROC curve, it becomes clear that incorporating more and diverse features is
beneficial in this setting. However, we have to note that the hybrid system of [21] combines
different recommenders using the Borda rank aggregation method, which was not learned on the
training data but manually selected while having knowledge about the test dataset (see also the
comment below on our own combination model).
   If we compare our best semantic frame results with the systems leveraging Linked Data, we
see that we achieve a better performance (RMSE of 0.6022) when using the DBpedia features alone
and that we get the best overall results when combining our best system with the Linked Data
features (RMSE of 0.5982). In this way it appears that combining semantic information from
different sources, i.e. from the linguistically grounded frame features and the explicit,
ontology-grounded DBpedia features, is beneficial in this setting. The AUC results, however, do
not corroborate this finding.
   Last, when we do not learn LOD+FEs together in one model, but learn them separately and
combine the results with a simple linear combination (these results are presented in brackets
in Table 5), as also done by [1], with a combination weight of 0.5, we achieve better results
(RMSE of 0.5664 and AUC of 0.5571). However, this notable improvement depends in the end on our
knowledge of the test dataset, as it influenced our choice of a linear combination, instead of
learning the combination of the different classifiers on the training data. Strictly speaking,
this is thus not a valid experimental result; nevertheless, it indicates there is most likely a
better hybrid design with feature combinations that will better utilize the semantic frame
features and should yield better results.

6. CONCLUSION
   In this paper we have presented an alternative approach to add semantic information to a
content-based book recommender system. We directly compared the addition of text-internal
semantic frame information with text-external ontological information based on Linked Open
Data (LOD), a popular research strand.
   We have shown that parsing the book plots with a state-of-the-art semantic frame parser,
SEMAFOR, delivers valuable additional semantic information. This information could enable a
system to fully grasp what is happening within a book. One of the added values of FrameNet is
that all frames are related in a taxonomy, which allows you to pinpoint those Events forming
the key components of a book. Based on a direct comparison between the frames and events and a
list of genres derived from DBpedia attributes, we have shown that although these data sources
show some similarities, the semantic frames should be able to represent more specific
information about what is happening in a particular book.
   In order to test this claim in more detail, we have performed experiments where the focus
was on generating new semantic features and finding out what these can contribute to a book
recommendation system using one global classifier as background model and one per-user
classifier. We see that exploiting semantic frame information outperforms a standard
bag-of-words unigram baseline and that especially incorporating frame elements and lexical
units evoking the frames allows for the best overall performance. If we compare our best system
to a system leveraging semantic LOD information, we observe that our frames approach is not
able to outperform this system. We do find, however, that if we combine these two semantic
information sources into one system we get the best overall performance. This might indicate
that combining semantic information from different sources, i.e. from the linguistically
grounded implicit frame features and the explicit, ontology-grounded DBpedia features, is
beneficial.
   This work has inspired many ideas for future work. Considering the current setup, we are
aware that we completely relied on the output of one semantic frame parser, i.e. SEMAFOR. We
believe that by using a filtering mechanism beforehand, e.g. to filter out those frames and/or
events which are less meaningful or noisy, or by applying a different parser or event
extraction techniques, new light can be shed on the added value of this type of information.
Also, since we now only relied on Wikipedia to extract book information, we had to reduce an
originally larger book dataset. We realize a lot of additional information about books can be
found online, for example on Google Books, Amazon, GoodReads, etcetera. Also, the same
techniques can be used to extract other types of information from both the items and users
under consideration for the recommendation task.
   As mentioned at the end of Section 5, we would like to further investigate whether another
hybrid design might yield better results. In this respect, it would be interesting to plug our
semantic knowledge into a collaborative-filtering approach to see whether this can actually
help the overall performance. Using our semantic frames we could also inspect in more detail
typical problems recommender systems face, such as cold-start and data sparsity.

Acknowledgments
The work presented in this paper has been partly funded by the PARIS project (IWT-SBO-Nr.
110067). Furthermore, Orphée De Clercq was supported by an exchange grant from the German
Academic Exchange Service (DAAD STIBET scholarship program). We would like to thank Christian
Meilicke for his help in providing the manually derived genres and his help with building the
original ESWC recommender system.

7. REFERENCES
 [1] C. Basu, H. Hirsh, W. Cohen, et al. Recommendation as classification: Using social and
     content-based information in recommendation. In AAAI'98, pages 714–720, 1998.
 [2] D. Billsus and M. J. Pazzani. User modeling for adaptive news access. User Modeling and
     User-Adapted Interaction, 10(2-3):147–180, 2000.
 [3] M. Degemmis, P. Lops, and G. Semeraro. A content-collaborative recommender that exploits
     WordNet-based user profiles for neighborhood formation. User Modeling and User-Adapted
     Interaction, 17(3):217–255, 2007.
 [4] T. Di Noia, R. Mirizzi, V. Ostuni, D. Romito, and M. Zanker. Linked open data to support
     content-based recommender systems. In I-SEMANTICS '12 Proceedings of the 8th International
     Conference on Semantic Systems, 2012.
 [5] D. Das, A. F. T. Martins, N. Schneider, and N. A. Smith. Frame-semantic parsing.
     Computational Linguistics, 40(1):9–56, 2014.
 [6] M. Eirinaki, M. Vazirgiannis, and I. Varlamis. SEWeP: using site semantics and a taxonomy
     to enhance the web personalization process. In Proceedings of the ninth ACM SIGKDD
     international conference on Knowledge discovery and data mining, 2003.
 [7] C. J. Fillmore. Frame semantics. Linguistics in the Morning Calm, pages 111–137, 1982.
 [8] C. J. Fillmore. Frames and the semantics of understanding. Quaderni di Semantica, IV(2),
     1985.
 [9] C. J. Fillmore, C. R. Johnson, and M. R. L. Petruck. Background to FrameNet. International
     Journal of Lexicography, 16(3):235–250, 2003.
[10] E. Gabrilovich and S. Markovitch. Computing semantic relatedness using Wikipedia-based
     explicit semantic analysis. In Proceedings of the 20th International Joint Conference on
     Artificial Intelligence, 2007.
[11] B. Heitmann and C. Hayes. Using linked data to build open, collaborative recommender
     systems. In AAAI Spring Symposium: Linked Data Meets Artificial Intelligence, 2010.
[12] J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl. Evaluating collaborative
     filtering recommender systems. ACM Transactions on Information Systems, 22(1):5–53, 2004.
[13] R. C. Holte. Very simple classification rules perform well on most commonly used datasets.
     Machine Learning, 11(1):63–90, 1993.
[14] J. Lehmann, R. Isele, M. Jakob, A. Jentzsch, D. Kontokostas, P. N. Mendes, S. Hellmann,
     M. Morsey, P. van Kleef, S. Auer, and C. Bizer. DBpedia – A Large-scale, Multilingual
     Knowledge Base Extracted from Wikipedia. Semantic Web Journal, 2013.
[15] P. Lops, M. Gemmis, and G. Semeraro. Content-based recommender systems: State of the art
     and trends. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender
     Systems Handbook, pages 73–105. Springer US, 2011.
[16] G. Martín, S. Schockaert, C. Cornelis, and H. Naessens. An exploratory study on
     content-based filtering of call for papers. Multidisciplinary Information Retrieval,
     Lecture Notes in Computer Science, 8201:58–69, 2013.
[17] V. Ostuni, T. Di Noia, E. Di Sciascio, and R. Mirizzi. Top-n recommendations from implicit
     feedback leveraging linked open data. In Proceedings of RecSys 2013, 2013.
[18] A. Passant. dbrec: music recommendations using DBpedia. In Proceedings of the 9th
     International Semantic Web Conference (ISWC'10), 2010.
[19] J. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, California,
     1993.
[20] P. Resnik. Using information content to evaluate semantic similarity in a taxonomy. In
     Proceedings of the 14th international joint conference on Artificial intelligence
     (IJCAI'95), 1995.
[21] P. Ristoski, E. L. Mencia, and H. Paulheim. A Hybrid Multi-Strategy Recommender System
     Using Linked Open Data. In LOD-enabled Recommender Systems Challenge (ESWC 2014), 2014.
[22] J. Ruppenhofer, M. Ellsworth, M. R. L. Petruck, and C. R. Johnson. FrameNet II: Extended
     theory and practice. Technical report, 2005.
[23] M. Schuhmacher and C. Meilicke. Popular Books and Linked Data: Some Results for the
     ESWC-14 RecSys Challenge. In LOD-enabled Recommender Systems Challenge (ESWC 2014), 2014.

