Exploiting FrameNet for Content-Based Book Recommendation

Orphée De Clercq
LT3, Language and Translation Technology Team, Ghent University
orphee.declercq@ugent.be

Michael Schuhmacher
Research Group Data and Web Science, University of Mannheim
michael@informatik.uni-mannheim.de

Simone Paolo Ponzetto
Research Group Data and Web Science, University of Mannheim
simone@informatik.uni-mannheim.de

Véronique Hoste
LT3, Language and Translation Technology Team, Ghent University
veronique.hoste@ugent.be

ABSTRACT

Adding semantic knowledge to a content-based recommender helps to better understand the items and user representations. Most recent research has focused on examining the added value of semantic features based on structured web data, in particular Linked Open Data (LOD). In this paper, we focus in contrast on semantic feature construction from text, by incorporating features based on semantic frames into a book recommendation classifier. To this purpose we leverage the semantic frames obtained by parsing the plots of the items under consideration with a state-of-the-art semantic parser. By investigating this type of semantic information, we show that these frames are also able to represent information about a particular book, but without the need for explicitly structured data describing the books. We reveal that exploiting frame information outperforms a basic bag-of-words approach and that especially the words relating to those frames are beneficial for classification. In a final step we compare and combine our system with the LOD features from a system leveraging DBpedia as knowledge resource. We show that both approaches yield similar results and reveal that combining semantic information from these two different sources might even be beneficial.

Categories and Subject Descriptors

H.3 [Information Storage and Retrieval]: Content Analysis and Indexing; H.4 [Information Systems Applications]: Miscellaneous

Keywords

Content-Based Recommender Systems, Semantic Frame, Linked Data

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Copyright 2014 for the individual papers by the paper's authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.
CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA. Copyright 2014 by the author(s).

1. INTRODUCTION

Recommender systems are omnipresent online and constitute a significant part of the marketing strategy of various companies. In recent years, a lot of advances have been made in constructing collaborative filtering systems, whereas the research on content-based recommenders has lagged somewhat behind. Similar to evolutions in information retrieval research, the focus has been more on optimizing tools and finding more sophisticated techniques, leveraging for example big data, than on the actual understanding or processing of the items or text at hand.

In Natural Language Processing (NLP), on the other hand, huge advances have been made in processing text from both a lexical and a semantic perspective. In this respect, we believe it is important to test whether a content-based recommender system might actually benefit from plugging in more semantically enriched text features, which is the purpose of the current research. In this paper we wish to investigate to what extent leveraging semantic frame information can help in recommending books to users. We chose to work with books, since these typically contain a chronological description of certain actions or events which might be indicative of the interests of a particular reader. Someone might enjoy reading historical novels, for example, but prefer those novels where a love story is told in closer detail over those where a typical revenge story is portrayed. We hypothesize that the semantic frames and/or events in these two types of historical novels will be different. In other words, we wish to investigate to what extent deep semantic parsing of the plots describing a book, following the FrameNet paradigm, can help for recommendation.

In order to validate these claims we performed an extensive analysis on a book recommendation dataset which was provided in the framework of the 2014 ESWC challenge. What is particularly interesting about this dataset is that all the books have been mapped to their corresponding DBpedia URIs, which allows us to directly compare externally gained semantic information, as available in the Linked Open Data cloud (LOD), with internal semantic information based on the plots themselves. Our analysis reveals that although some frames and events are good indicators of genres derived from external DBpedia information, they also represent some additional information which might help the recommendation process.

To actually verify this finding we test the added value of incorporating frame information as semantic features in a basic recommender system. We see that exploiting this kind of semantic information outperforms a standard bag-of-words unigram baseline and that incorporating frame elements and lexical units evoking the frames allows for the best overall performance. If we compare our best system to a system leveraging semantic LOD information, we observe that our frames approach is not able to outperform this system. We do find, however, that if we combine these two semantic information sources into one system we get the best overall performance. This might indicate that combining semantic information from different sources, i.e. from the linguistically grounded implicit frame features and the explicit, ontology-grounded DBpedia features, is beneficial.

The remainder of this paper is structured as follows. In Section 2 we describe some related work with an explicit focus on the added value of semantic information for recommender systems. In Section 3 we then explain in closer detail the construction and reasoning behind the semantic frame enhancement. We then continue by describing the actual experimental setup (Section 4) and have a closer analysis of the results (Section 5). We finish with some concluding remarks and ideas for future work (Section 6).

2. RELATED WORK

In content-based recommender systems, the items to be recommended are represented by a set of features based on their content, whereas a user is represented by his profile. To build a recommender both information sources are compared. Most content-based recommenders use quite simple retrieval models, such as keyword matching or the vector space model with basic TF-IDF weighting [15]. A problem with these models is that they tend to ignore semantic information. To overcome this one can use Explicit Semantic Analysis (ESA) [10] instead of TF-IDF weighting, which allows a document to be represented as a weighted vector of concepts. Another way to add more linguistic knowledge is to use, for example, information from WordNet as done by [6, 3]. An alternative is to use language models to represent documents. This was done for example by [16] when exploring content-based filtering of calls for papers. Besides retrieval models, machine learning techniques where a system learns the user profile and classifies items as interesting or not are also used for content-based recommenders. One of the first to do this was [2], using a Naïve Bayes classifier.

When it comes to adding semantic information to recommender systems we see that leveraging Linked Open Data (LOD) is currently a popular research strand. [11] and [18] were among the first to use LOD for recommendation. The former use this information to build open recommender systems whereas the latter built a music recommender using collaborative filtering techniques. [4] was the first to really leverage LOD to build a content-based recommender and the first to exploit the semantics of the relations in the link hierarchy. They use LOD information from DBpedia, Freebase and LinkedMDB as the only background knowledge for a movie recommender system and show that thanks to this ontological information the quality of a standard content-based system can be improved. In more recent work, the semantic item descriptions based on LOD have been merged with positive implicit feedback in a graph-based representation to produce a hybrid top-N item recommendation algorithm, SPrank [17], which further underlines the added value of this kind of data. Moreover, in 2014, in order to spark research on LOD and content-based recommender systems, a shared task was organized by the same authors, i.e. the ESWC-14 Challenge (footnote 1).

In content-based recommendation, the advances that have been made were made possible thanks to the availability of designated datasets. These include data for predicting music, Last.FM (footnote 2), and/or movies, MovieLens (footnote 3). Up till now little research has been performed on other item types, such as books. The ESWC challenge, however, made a book recommendation dataset available which is mapped to DBpedia. DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make it available as linked RDF data [14]. This dataset will be used as our main data source. In this paper, we focus on the feature construction for a classifier in that we also incorporate semantic features based on the semantic frames present within the items to be recommended. This is, to our knowledge, the first approach that tries to leverage this kind of data and is one way of tackling the issue of Limited Content Analysis within recommender systems [4]. In order to validate these claims we will compare and combine our best system with a system exploiting LOD.

Footnote 1: http://challenges.2014.eswc-conferences.org/index.php/RecSys
Footnote 2: http://labrosa.ee.columbia.edu/millionsong/
Footnote 3: http://grouplens.org/datasets/movielens/

3. FRAME-ENHANCEMENT

In this section we give some more information about why we believe exploiting frame information might help with recommendations. First, we introduce some basic concepts and theory, after which we explain how we apply a state-of-the-art semantic frame parser to our dataset and provide a first analysis. We hypothesize that a plot description tells more about a book than a more global semantic classification based on external semantic information as provided by the LOD cloud. This reasoning can be transferred to other data sources containing a large amount of textual information.

3.1 Frame semantics and FrameNet

Following the basic assumption that the meanings of most words can best be understood on the basis of a semantic frame, FrameNet [9] was developed as a linguistic resource storing considerable information about lexical and predicate-argument semantics in English.

FrameNet is grounded in the theory of frame semantics [7, 8]. This theory tries to describe the meaning of a sentence by characterizing the background knowledge required to understand this sentence. This knowledge is presented in an idealized, i.e. prototypical, form. A frame is thus a structured representation of a concept. It can be a description of a type of event, relation or entity, and the participants in it. In Table 1 we present an example of such a frame, KILLING. We see it is a semantic class containing various predicates, also known as lexical units (LUs), evoking the described situation, e.g. killer, murder, lethal. Moreover, it illustrates that within FrameNet each frame comes with a set of semantic roles, i.e. frame elements (FEs), which can be perceived as the participants and/or properties of a frame and which are of course also lexicalized in the text itself, e.g. Killer: John, Instrument: with only a pocketknife.

Table 1: Example of a frame

Frame: KILLING
Definition: The KILLER or CAUSE causes the death of the VICTIM.
FEs | KILLER | John drowned Martha.
FEs | VICTIM | I saw heretics beheaded.
FEs | CAUSE | The rockslide killed nearly half of the climbers.
FEs | INSTRUMENT | It's difficult to suicide with only a pocketknife.
LUs | ..., kill.v, killer.n, killing.n, lethal.a, liquidate.v, liquidation.n, liquidator.n, lynch.v, massacre.n, massacre.v, matricide.n, murder.n, murder.v, murderer.n, ...

FrameNet's latest release (1.5) contains 877 frames and about 155K exemplar sentences (footnote 4). An interesting aspect of the FrameNet lexicon is that asymmetric frame relations can relate two frames, thus forming a complex hierarchy containing both is-a-like and non-hierarchical relations [22]. In this work, we are particularly interested in the former type, also known as Inheritance relations. This type of relation entails that the child frame is a subtype of the parent frame. If we look for instance at our Killing example, of which the taxonomy is visualized in Figure 1 (footnote 5), we are able to find out that this frame is a child of the frame Transitive action, which is in turn a child of both the frame Objective Influence and, more interestingly, the frame Event. This taxonomy thus enables us to find even more semantic properties about specific frames.

Figure 1: Example of Inheritance relations related to the KILLING frame.

Footnote 4: This release is available at http://framenet.icsi.berkeley.edu
Footnote 5: This graph was produced using the FrameGrapher tool, https://framenet.icsi.berkeley.edu/fndrupal/FrameGrapher

3.2 Exploiting FrameNet

3.2.1 Book dataset

For the research described in this paper, we worked with the dataset of the ESWC challenge, which is in fact a re-elaborated version of the LibraryThing dataset (footnote 6). This dataset contains books that are part of a particular user's online catalog containing the books he/she has read or owns. For the challenge, the books available in the dataset have been mapped to their corresponding DBpedia URIs [17]. Based on the available information we were able to download the plot description of each book from its corresponding Wikipedia page (this plot information is lacking in DBpedia). In this way we set out to investigate whether knowing more about what is actually happening in a book can enhance the recommendation. We worked with a subset by only including books for which a uniform and unambiguous DBpedia link was available and which actually contained plot information on Wikipedia. In total our final dataset contains 5,063 books with an average plot length of 312 words (footnote 7).

In order to annotate the semantic frames, each plot was parsed using the state-of-the-art frame-semantic parser SEMAFOR [5]. This parser extracts semantic predicate-argument structures from text using a statistical model and is trained on the FrameNet 1.5 release. It takes as input the text as such, performs some preprocessing steps and outputs, sentence by sentence, all frames that are present within a text. Each frame is represented by one of the 877 possible frame names, and the lexical units and frame elements (both generic and lexicalized form) are output as well. An example is presented in Table 2. This is the plot description of the book The Prince and the Pilgrim. In the text itself, the lexical units evoking the frames are indicated in square brackets. The frames and LUs which are represented in bold are those frames which actually constitute an Event. Finding out which frames are events can be done by exploiting the taxonomy (cf. supra), which enables us to find out more semantic properties of specific frames. Intuitively, we can state that especially those Event frames give most information about what is happening within a book: the above-mentioned book is clearly a revenge story. However, the other frames might also pinpoint important aspects, e.g. the repetition of the Leadership and Kinship frames could inform us that this novel is about royalty and family.

Table 2: Example of two sentences of a plot description and its resulting frames.

PLOT | The [Prince], the protagonist, is [named] Alexander. His [father], [Prince] Baudouin, is [murdered] by the [King] of Cornwall, [King] [March]. [When] Alexander [comes] of [age], he [sets out] to Camelot to [seek] justice from [King] Arthur and to [avenge] the [death] of his [father]....
FRAMES | Leadership, Appointing, Kinship, Leadership, Killing, Leadership, Leadership, Calendric unit, Temporal collocation, Arriving, Calendric unit, Departing, Seeking to achieve, Leadership, Revenge, Death, Kinship.

What this example also illustrates is that the SEMAFOR parser is not 100% accurate. For example, the name of a particular king (King March) is interpreted by the parser as evoking the frame Calendric unit. We should thus keep in mind that a certain amount of noise is also introduced into our dataset. Moreover, some frames, such as Arriving or Temporal collocation, are correctly labeled but do not really contribute interesting semantic information.

For all books in our dataset we parsed the plots using SEMAFOR, after which we also separately extracted those frames which can have the Event frame as a parent. Some data statistics regarding these annotations are presented in Table 3, which reveals that the information we have available is rather skewed.

Table 3: Plot annotation statistics representing the average number of real and unique frames and events per book and their standard deviations

       | # Avg | Stdev | # Avg unique | Stdev
Frames | 197   | 205   | 96           | 61
Events | 42    | 45    | 22           | 15

Footnote 6: http://www.macle.nl/tud/LT/
Footnote 7: This dataset will also be made available to the research community in due time.

3.2.2 Semantic frames versus Linked Open Data

As previously mentioned, we hypothesize that using frames might represent different information than using semantic information represented in the LOD cloud. The books dataset we have at hand is particularly useful to verify this claim since all books have been mapped to their DBpedia URIs.

In order to do so, we relied on a manual subdivision of all books into genres based on LOD. This classification was made by [23] by parsing the abstract (dbo:abstract), the genre (dbo:literaryGenre, dbp:genre) and the subject (dcterms:subject) of each book against a regular expression pattern of thirty distinct genres. The authors performed this step to allow for more data coverage. However, by doing so they also made a combination of various LOD information categories, which enables us to directly compare these with our semantic frames. If we have a look at our running example, The Prince and the Pilgrim (footnote 8), we notice that this book is classified under the Fantasy genre.

Based on this genre mapping, we calculated the gain ratio [19] of our semantic frames representation with relation to the genres, thus considering the frames as features for genre classification. These gain ratios can then be interpreted as feature weights, and ranked according to the amount of information they add to discriminating between the thirty possible genres. We start our analysis by first only considering the semantic frame annotations. It became apparent, however, that it might be more interesting to also inspect more closely those frames which are Events, since these intuitively better represent what is actually happening.

The result of these analyses is presented in Table 4. Because of space constraints, we only show the five genres covering most books of our dataset. For each genre, the table contains the ten top features (frames and events), i.e. those with the highest gain ratio. The cell colour represents the manual analysis: indicated in light grey are those frames and events occurring only within one particular genre, and in darker grey the frames and events which are representative for a specific genre. Regarding the frames, we see that it is more difficult to find distinctive features correlating with the genre (light grey). In the upper part, only the Science Fiction and Crime genres contain truly representative frames based on our manual analysis (dark grey). If we go to the level of the Events, we see that this already allows for finding more unique events per genre. Again, the Science Fiction and Crime genres are best represented. When we had a closer look at other discriminating features we found the same tendency. In the Crime genre, for example, other Events such as Verdict, Revenge, Execution and Robbery all appeared within the top twenty features.

Table 4: Top ten features with the highest gain ratios in the five most popular LOD genres. Light grey cells represent genre-unique and dark grey ones genre-representative features.

       | Fantasy | Science Fiction | History | Children | Crime
FRAMES | Jury deliberation | Beyond compare | Representing | Memorization | Extradition
FRAMES | Bond maturation | Becoming dry | Intentional traversing | Measure area | Go into shape
FRAMES | Intentional traversing | Containment relation | Dominate competitor | Estimated value | Exporting
FRAMES | Cause to rot | Dunking | Getting vehicle underway | Rope manipulation | Becoming dry
FRAMES | Get a job | Exclude member | Cause to rot | Degree of processing | Arson
FRAMES | Beyond compare | Representing | Beyond compare | Jury deliberation | Measure area
FRAMES | Representing | Jury deliberation | Probability | Bond maturation | Dominate competitor
FRAMES | Locale by ownership | Cause to rot | Jury deliberation | Intentional traversing | Containment relation
FRAMES | Ratification | Medium | Color qualities | Cause to be dry | Reading aloud
FRAMES | Commutation | Cause change of phase | Get a job | Drop in on | Extreme point
EVENTS | Surrendering possession | Change of consistency | Eventive affecting | Intentionally affect | Endangering
EVENTS | Dodging | Immobilization | Historic event | Examination | Posing as
EVENTS | Immobilization | Execute plan | Extradition | Absorb heat | Experience bodily harm
EVENTS | Renting | Cause impact | Surrendering possession | Cause to experience | Enforcing
EVENTS | Reparation | Reparation | Corroding caused | Fighting activity | Cause to be wet
EVENTS | Heralding | Eventive affecting | Dodging | Dodging | Intentionally affect
EVENTS | Soaking | Get a job | Clemency | Rope manipulation | Intercepting
EVENTS | Intentional traversing | Cause to be sharp | Intentional traversing | Intentional traversing | Change resistance
EVENTS | Cause to rot | Cause to rot | Cause to rot | Drop in on | Go into shape
EVENTS | Get a job | Cause change of phase | Get a job | Cause to be dry | Extradition

From this analysis we could deduce that both the frames and events might deliver the same type of information as the LOD, with the events being more representative. However, what becomes clear is that the frames also contribute more information. They can represent what is happening within a book. If we again consider our running example (cf. Table 2), which is classified as Fantasy, we feel that enriching a recommender with semantic frame information, and especially with event information, might account for a better recommendation. This brings us to the actual experiments.

Footnote 8: http://dbpedia.org/resource/The_Prince_and_the_Pilgrim

4. EXPERIMENTS

For our experiments we focus on the generation of new, semantic features. In our experimental setting we aim to evaluate the contribution of those features and thus do not explicitly focus on engineering towards a top recommendation performance.

4.1 Experimental Set-Up and Evaluation

We opt to add our semantic features to an existing recommender system [23], which participated, and performed well, in the ESWC'14 Challenge. Though we do apply feature weighting and feature selection as described below, the overall item classification and collaborative-filtering elements of the base system remain unchanged. This allows us to directly compare the predictive power of the frame-based features with the DBpedia-based features used by the original system, in particular as both approaches are different utilizations of the same information source, i.e. Wikipedia, and dataset, i.e. the ESWC RecSys Challenge data.

We use a reduced version of the dataset, based on a filtering of the 5,063 books that were retained as having sufficient plot information available (Section 3.2). This dataset has binary ratings and consists of 53,665 user-item-rating triples (6,162 users, 4,251 items) in the training data and 50,654 triples (6,180 users, 4,311 items) in the evaluation dataset. Even though this is a binary classification task, we opt to output the positive class likelihood and not the final binary classification in order to avoid making a decision about the cut-off for the likelihood values. Consequently, we evaluate with root-mean-squared error (RMSE) to also capture the degree of confidence between the classification and the gold-standard test dataset (footnote 9). RMSE is calculated as:

\[
\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (X_i - x_i)^2}
\]

in which X_i is the prediction and x_i the response value, i.e. the correct value for the task at hand, and m is the number of items for which a prediction is made. Speaking in practical terms, the lower the RMSE value the better, because the prediction confidence is then closer to the actual gold standard.

In addition, again motivated by wanting to avoid choosing a cut-off point for the class assignment, we follow [12] and evaluate with a receiver operating characteristic (ROC) curve and also compute the area under the curve (AUC) for it. While, in contrast to RMSE, the ROC curve is computed only on the relative ordering of the predictions sorted by confidence values, it offers the advantage of showing how a classifier would perform given different cut-off values. In addition, with ROC we can compare against recommender systems that output only an (implicit) ranking and no class confidence values.

The base system by [23] that we extend is a simple content-based recommender which trains two Naïve Bayes classifiers (footnote 10) on book features acquired from DBpedia: one global classifier as background model and one per-user classifier to capture individual preferences, trained on a user neighborhood of variable size. In our experiments, we leave this setting unchanged and only vary the features used for item representation. We experimented with five different feature representations, which are explained in closer detail in the next section.

Footnote 9: Obtained from the ESWC'14 Challenge Chairs upon request.
Footnote 10: Even though it is a simple approach, in a preliminary experiment Naïve Bayes outperformed an SVM, motivating us not to compare different classifiers but to focus on feature selection. In addition, Naïve Bayes was, as expected, significantly faster compared to other classifiers.

4.2 Feature Representation

1. Baselines

First, we established two baselines: the first baseline was constructed by including the majority class based on the training data; in our case the majority class is '0'. As a second baseline we decided to include a bag-of-words approach containing token unigrams from all the different plots.

The next three groups of features all relate to the frame representation of the plots based on the SEMAFOR output (cf. Section 3.2).

2. Frames

For the frames as such, we decided to include the resulting frame names (e.g. Killing, Kinship, Leadership) as a separate setting. In total this can lead to a maximum of 877 discriminating features, which is a large feature space shrinkage compared to the bag-of-words representation. This is why we decided to also take into consideration those particular words evoking the frames, the Lexical Units (e.g. murdered, father, Prince) on the one hand, and the lexical representations of the Frame Elements, i.e. the semantic roles, evoked by this frame on the other hand (e.g. Prince Baudouin, by the King of Cornwall, King March). In a final setting, we incrementally combine these various elements of data, thus giving more information to our classifier.

3. Events

As was illustrated in Section 3.2, the Events occurring within a book intuitively seem to represent important information about what is actually happening. This is why we also decided to perform the same experiments as with the frames, but this time only incorporating those frames which have a possible Event parent somewhere in the FrameNet hierarchy. Looking only at the Events further reduced our feature space to a maximum of 234 features. We therefore also made the same combinations as mentioned above with all possible LUs and FEs relating only to Events.

4. Taxonomy

In order to exploit the hierarchical structure of FrameNet even further, we decided to also investigate three other settings. First we explored whether including, besides a frame, also its direct parent, thus going one level up in the graph, might help. We did the same in the other direction, by only including the children which are at the bottom of our taxonomy (the leaves). Another way of incorporating this graph information was to calculate, for each possible frame pair that was found in a plot, its least common subsumer [20] (LCS), i.e. the parent both frames have in common resulting in the shortest path. Since the FrameNet taxonomy as such is not hypercomplex, i.e. the maximum distance between two frames is twelve, we decided to filter out those parents which are too generic by manually inspecting the LCS (footnote 11).

For the four above-mentioned setups, the same feature selection methods were employed. In order to allow for a good representation, all word-based features (bag-of-words, LUs and FEs) were first tokenized, stemmed and filtered on stop words. For the automatic feature selection, we first use unsupervised feature attribute weighting by computing the standard TF-IDF weights, since all our features are in the end derived from text (book plots):

\[
\mathit{TF\text{-}IDF}_i = \ln(1 + \mathit{tf}_i) \cdot \ln(N / \mathit{df}_i)
\]

Next, we use attribute selection by computing the gain ratio with relation to the binary class label in the training data:

\[
R_G(\mathit{Attr}, \mathit{Class}) = \frac{H(\mathit{Class}) - H(\mathit{Class} \mid \mathit{Attr})}{H(\mathit{Attr})}
\]

This should allow us to filter out noise or unimportant features. We keep only those features with a gain ratio larger than zero (R_G > 0) (footnote 12).

5. Linked Open Data (LOD)

In a final setup we compare our best setting with the LOD features used by the base system, i.e. properties and values from DBpedia, and apply the same feature weighting and selection process. The features in the base system were manually selected and contain explicit book attributes, e.g. dbo:author (db:Umberto_Eco), but also categorical information such as dbo:literaryGenre (db:Historical_novel), dcterms:subject (category:Novels_set_in_Italy) or rdf:type (yago:PhilosophicalNovels), and untyped Wikipedia links in general.

We use the same set of features as reported by [23], but, to remain consistent across all experimental settings, apply our feature selection and weighting approach and use our reduced training and test dataset. In addition, we tested the combination of the DBpedia features with our best-performing frame approach.

Footnote 11: We looked at the most frequent LCS nodes and excluded the first 10 generic nodes such as Artifact, Relation, Intentionally affect, Gradable attributes, Transitive action.
Footnote 12: Preliminary experiments revealed that keeping all features, as well as doing classifier-based feature selection with OneR [13] with 5-fold cross-validation on the training data, constantly underperformed against this setting.

5. RESULTS

We report experimental results for the different feature settings in Table 5. Overall, the two best performing frame features are the Frame elements and the Frames+LUs+FEs; both achieve an RMSE of 0.6036. We see that the best result is obtained when combining the Frame elements with the LOD system (RMSE of 0.5982). Looking at the AUC for the ROC curve, both feature sets still perform well, but not as well as the DBpedia features alone, which achieve the best overall AUC of 0.5588.

Table 5: Experimental results on test dataset (N = 50,654) with classifier trained on different feature types (best results per category in bold).

Category  | Features                | RMSE     | AUC
Baselines | Majority voting (0)     | 0.7705   | n/a
Baselines | Words as such           | 0.6145   | 0.5431
Frames    | Frames as such          | 0.6272   | 0.5377
Frames    | Lexical units (LUs)     | 0.6266   | 0.5398
Frames    | Frame elements (FEs)    | 0.6036   | 0.5468
Frames    | Frames + LUs            | 0.6259   | 0.5389
Frames    | Frames + LUs + FEs      | 0.6036   | 0.5453
Events    | Events as such          | 0.6132   | 0.5148
Events    | Events + LUs            | 0.6259   | 0.5310
Events    | Events + LUs + FEs      | 0.6237   | 0.5296
Taxonomy  | Frames One up           | 0.6244   | 0.5297
Taxonomy  | Frames Bottom           | 0.6253   | 0.5370
Taxonomy  | Frames + LCS            | 0.6285   | 0.5376
LOD       | DBpedia features        | 0.6022   | 0.5588
LOD       | DBpedia + FEs           | 0.5982   | 0.5498
LOD       | (DBpedia + FEs hybrid)  | (0.5664) | (0.5571)

Considering the RMSE values, we observe that the majority baseline is easily outperformed by all different settings. Looking at the bag-of-words baseline, however, illustrates that having the words of the plot available for recommendation is already a baseline that is quite difficult to beat. Contrary to our expectations, our settings with only frames or events do not outperform this baseline. We do see that the events as such, which constitute a much smaller feature space, perform slightly better than the frames. The bag-of-words baseline is only outperformed when using features that actually provide some sort of word filtering mechanism: the Frame Elements are the lexical representation of words which are evoked by certain frames in the form of semantic roles. Even though these features are extracted from the text, they perform better than the bag-of-words (Words as such) baseline approach (0.6145), which does not make use of any semantic information. Analyzing the R_G-ranked feature attributes revealed that also for the other best frame approach, Frames+LUs+FEs, the dominant attributes are the Frame elements, which were ranked highest. What is strange is that we do not find a similar trend when performing the same combination with our Event frames. This is probably because the feature space is too small to make a well-informed decision.

Figure 2 presents the ROC curves for our features; for the sake of readability only the most interesting curves are plotted. As is to be expected from the AUC values, all curves are very close together. Besides not being far away from the diagonal, no curve shows a clearly recognizable cut-off value. We observe that the DBpedia features are slightly better for the left and partially the middle part of the curve, leading to the interpretation that those features are superior for recommender systems which focus on quality. Comparing the best frames-based approach (FEs) with the bag-of-words baseline (Words), we see that FEs are mostly better than just words, with some exception at a false positive rate of around 0.23.

Figure 2: ROC curve for selected features and the top-performing ESWC system [21]. (Axes: false positive rate against true positive rate; curves: Words, FEs, DBpedia, DBpedia_FEs, Ristoski.)

We also compare our system with the hybrid recommender system from Ristoski et al. [21] (AUC 0.5848), which was the second best system of the ESWC challenge and performed essentially equally well as the winning system. That system combined many different features, not only LOD, but also user ratings and explicit collaborative filtering approaches (footnote 13). Looking at the ROC curve, it becomes clear that incorporating more and diverse features is beneficial in this setting. However, we have to note that this system combines different recommenders using the Borda rank aggregation method, which was not learned on the training data but manually selected while having knowledge about the test dataset (see also the comment below on our own combination model).

If we compare our best semantic frame results with the systems leveraging Linked Data, we see that the DBpedia features alone achieve a better performance (RMSE of 0.6022) than our best frame-based setting (RMSE of 0.6036), and that we get the best overall results when combining our best system with the Linked Data (RMSE of 0.5982). In this way it appears that combining semantic information from different sources, i.e. from the linguistically grounded frame features and the explicit, ontology-grounded DBpedia features, is beneficial in this setting. The AUC results, however, do not corroborate this finding.

Footnote 13: As that system only outputs scores for the purpose of ranking, we transformed those into confidences by dividing each [...]

[...] to add semantic information to a content-based book recommender system. We directly compared the addition of text-internal semantic frame information with text-external ontological information based on Linked Open Data (LOD), a popular research strand.

We have shown that parsing the book plots with a state-of-the-art semantic frame parser, SEMAFOR, delivers valuable additional semantic information. This information could enable a system to fully grasp what is happening within a book. One of the added values of FrameNet is that all frames are related in a taxonomy, which allows one to pinpoint those Events forming the key components of a book. Based on a direct comparison between the frames and events and a list of genres derived from DBpedia attributes, we have shown that although these data sources show some similarities, the semantic frames should be able to represent more specific information about what is happening in a particular book.
In order to test this claim in closer detail, we have per- Last, when not learning LOD+FEs together in one model, formed experiments where the focus was on generating new but separately and combine results with a simple linear com- semantic features and find out what these can contribute to bination (these results are presented in brackets in Table 5), a book recommendation system using one global classifier as also done by [1], with = 0.5, we achieve better results as background model and one per-user classifier. We see (RMSE of 0.5664 and AUC of 0.5571). However, this no- that exploiting semantic frame information outperforms a table improvement depends in the end on our knowledge standard bag-of-words unigram baseline and that especially of the test dataset, as it influenced our choice of a linear incorporating frame elements and lexical units evoking the combination, instead of learning the combination of the dif- frames allows for the best overall performance. If we com- ferent classifiers on the training data. Strictly speaking, this pare our best system to a system levering semantic LOD is thus not a valid experimental result, nevertheless it indi- information, we observe that our frames approach is not cates there is most likely a better hybrid design with feature able to outperform this system. We do find, however, that combinations that will better utilize the semantic frame fea- if we combine these two semantic information sources into tures and should yield better results. one system we get the best overall performance. This might indicate that combining semantic information from di↵er- ent sources, i.e. from the linguistically grounded implicit 6. CONCLUSION frame features and the explicit, ontology grounded DBpedia In this paper we have presented an alternative approach features, is beneficial. This work has inspired many ideas for future work. Con- score by a constant. 19 sidering the current setup, we are aware that we completely [8] C. J. Filmore. 
Frames and the semantics of relied on the output of one semantic frame parser, i.e. SE- understanding. Quaderni di Semantica, IV(2), 1985. MAFOR. We believe that using a filtering mechanism be- [9] C. J. Filmore, C. R. Johnson, and M. R. L. Petruck. forehand, e.g. to filter out those frames and or events which Background to framenet. International Journal of are less meaningful or noisy, or that by applying a di↵erent Lexicography, 16(3):235–250, 2003. parser or event extraction techniques new lights can be shed [10] E. Gabrilovich and S. Markovitch. Computing on the added value of this type of information. Also, since we semantic relatedness using wikipedia-based explicit now only relied on Wikipedia to extract book information, semantic analysis. In Proceedings of the 20th we had to reduce an original larger book data. We realize International Joint Conference on Artifical a lot of additional information about books can be found intelligence, 2007. online, for example on Google Books, Amazon, GoodReads, [11] B. Heitmann and C. Hayes. Using linked data to build etcetera. Also the same techniques can be used to extract open, collaborative recommender systems. In AAAI other types of information from both the items and users Spring Symposium: Linked Data Meets Artificial under consideration for the recommendation task. Intelligence, 2010. As mentioned at the end of Section 5 we would like to fur- [12] J. L. Herlocker, J. A. Konstan, L. G. Terveen, and ther investigate whether another hybrid design might yield J. T. Riedl. Evaluating collaborative filtering better results. In this respect, it would be interesting to recommender systems. ACM Transactions on plug our semantic knowledge in a collaborative-filtering ap- Information Systems, 22(1):5–53, 2004. proach to see whether this can actually help the overall per- [13] R. C. Holte. Very simple classification rules perform formance. Using our semantic frames we could also inspect well on most commonly used datasets. 
Machine in closer detail typical problems recommender systems face Learning, 11(1):63–90, 1993. such as cold-start and data sparsity. [14] J. Lehmann, R. Isele, M. Jakob, A. Jentzsch, D. Kontokostas, P. N. Mendes, S. Hellmann, Acknowledgments M. Morsey, P. van Kleef, S. Auer, and C. Bizer. The work presented in this paper has been partly funded by DBpedia – A Large-scale, Multilingual Knowledge the PARIS project (IWT-SBO-Nr. 110067). Furthermore, Base Extracted from Wikipedia. Semantic Web Orphée De Clercq was supported by an exchange grant from Journal, 2013. the German Academic Exchange Service (DAAD STIBET [15] P. Lops, M. Gemmis, and G. Semeraro. Content-based scholarship program). We would like to thank Christian recommender systems: State of the art and trends. In Meilicke for his help in providing the manually derived gen- F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, res and his help with building the original ESWC recom- editors, Recommender Systems Handbook, pages mender system. 73–105. Springer US, 2011. [16] G. Martı́n, S. Schockaert, C. Cornelis, and H. Naessens. An exploratory study on content-based 7. REFERENCES filtering of call for papers. Multidisciplinary [1] C. Basu, H. Hirsh, W. Cohen, et al. Recommendation Information Retrieval, Lecture Notes in Computer as classification: Using social and content-based Science, 8201:58–69, 2013. information in recommendation. In AAAI’98, pages [17] V. Ostuni, T. Di Noia, E. Di Sciascio, and R. Mirizzi. 714–720, 1998. Top-n recommendations from implicit feedback [2] D. Billsus and M. J. Pazzani. User modeling for leveraging linked open data. In Proceedings of RecSys adaptive news access. User Modeling and 2013, 2013. User-Adapted Interaction, 10(2-3):147–180, 2000. [18] A. Passant. dbrec: music recommendations using [3] M. Degemmis, P. Lops, and G. Semeraro. A dbpedia. In Proceedings of the 9th International content-collaborative recommender that exploits Semantic Web Conference (ISWC’10), 2010. 
wordnet-based user profiles for neighborhood [19] J. Quinlan. C4.5: Programs for Machine Learning. formation. User Modeling and User-Adapted Morgan Kaufmann, San Mateo, California, 1993. Interaction, 17(3):217–255, 2007. [20] P. Resnik. Using information content to evaluate [4] T. Di Noia, R. Mirizzi, V. Ostuni, D. Romito, and semantic similarity in a taxonomy. In Proceedings of M. Zanker. Linked open data to support content-based the 14th international joint conference on Artificial recommender systems. In I-SEMANTICS ’12 intelligence (IJCAI’95), 1995. Proceedings of the 8th International Conference on [21] P. Ristoski, E. L. Mencia, and H. Paulheim. A Hybrid Semantic Systems, 2012. Multi-Strategy Recommender System Using Linked [5] D. Dipanjan, A. F. T. Martins, N. Schneider, and Open Data. In LOD-enabled Recommender Systems N. A. Smith. Frame-semantic parsing. Computational Challenge (ESWC 2014), 2014. Linguistics, 40(1):9–56, 2014. [22] J. Ruppenhofer, M. Ellsworth, M. R. L. Petruck, and [6] M. Eirinaki, M. Vazirgiannis, and I. Varlamis. Sewep: C. R. Johnson. Framenet ii: Extended theory and using site semantics and a taxonomy to enhance the practice. Technical report, 2005. web personalization process. In Proceedings of the [23] M. Schuhmacher and C. Meilicke. Popular Books and ninth ACM SIGKDD international conference on Linked Data: Some Results for the ESWC-14 RecSys Knowledge discovery and data mining, 2003. Challenge. In LOD-enabled Recommender Systems [7] C. J. Filmore. Frame semantics. Linguistics in the Challenge (ESWC 2014), 2014. Morning Calm, pages 111–137, 1982. 20
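As an illustration of the feature weighting and selection step described before the results (TF-IDF weighting followed by keeping only features with a gain ratio above zero), the following is a minimal sketch in plain Python. It is not the implementation used in the experiments: the feature names, the toy data and the helper functions `tf_idf`, `entropy` and `gain_ratio` are assumptions introduced here for illustration only, and entropies are computed in base 2.

```python
import math
from collections import Counter


def tf_idf(tf, df, n_docs):
    """TF-IDF_i = ln(1 + tf_i) * ln(N / df_i)."""
    return math.log(1 + tf) * math.log(n_docs / df)


def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(attr, labels):
    """R_G(Attr, Class) = (H(Class) - H(Class|Attr)) / H(Attr)."""
    h_attr = entropy(attr)
    if h_attr == 0:
        return 0.0  # a constant attribute carries no information
    total = len(labels)
    h_cond = 0.0
    for value in set(attr):
        subset = [l for a, l in zip(attr, labels) if a == value]
        h_cond += (len(subset) / total) * entropy(subset)
    return (entropy(labels) - h_cond) / h_attr


# Toy data, invented for illustration: presence of two hypothetical binary
# features in four book plots, and a binary like/dislike label per plot.
features = {
    "fe_avenger": [1, 1, 0, 0],  # perfectly aligned with the label
    "fe_place": [1, 0, 1, 0],    # independent of the label
}
labels = [1, 1, 0, 0]

scores = {name: gain_ratio(col, labels) for name, col in features.items()}
# Keep only features with a gain ratio larger than zero (R_G > 0).
selected = [name for name, score in scores.items() if score > 0]
```

In this toy case only the feature that separates the two classes survives the R_G > 0 filter, which is the noise-filtering behaviour the selection step is meant to provide.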
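The late-fusion variant discussed in the results (training the frame-based and LOD-based classifiers separately and combining their outputs linearly with a weight of 0.5) can be sketched as below. This is a hedged illustration, not the experimental code: the confidence values are invented, the function names `combine` and `rmse` are hypothetical, and it assumes that RMSE is computed between predicted confidences and binary labels.

```python
import math


def combine(conf_a, conf_b, weight=0.5):
    """Weighted linear combination of two classifiers' confidences."""
    return [weight * a + (1 - weight) * b for a, b in zip(conf_a, conf_b)]


def rmse(predictions, targets):
    """Root mean squared error between confidences and binary labels."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


# Invented confidences for four user-book pairs (illustration only).
frame_conf = [0.9, 0.2, 0.6, 0.4]  # hypothetical frame-element classifier
lod_conf = [0.7, 0.4, 0.2, 0.8]    # hypothetical DBpedia-feature classifier
labels = [1, 0, 0, 1]

hybrid_conf = combine(frame_conf, lod_conf, weight=0.5)
hybrid_rmse = rmse(hybrid_conf, labels)
```

Fixing the weight by hand, as here, mirrors the caveat raised in the text: a methodologically cleaner design would learn the weight on held-out training data rather than choose it with knowledge of the test set.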