<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Linked Open Data-enabled Strategies for Top-N Recommendations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Cataldo Musto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pasquale Lops</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pierpaolo Basile</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Computer Science, Univ. of Bari Aldo Moro</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Computer Science, Univ. of Bari Aldo Moro</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Dept. of Computer Science, Univ. of Bari Aldo Moro</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Univ. of Bari Aldo Moro</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2014</year>
      </pub-date>
      <fpage>48</fpage>
      <lpage>55</lpage>
      <abstract>
        <p>The huge amount of interlinked information referring to different domains, provided by the Linked Open Data (LOD) initiative, could be effectively exploited by recommender systems to deal with the cold-start and sparsity problems. In this paper we investigate the contribution of several features extracted from the Linked Open Data cloud to the accuracy of different recommendation algorithms. We focus on the top-N recommendation task in the presence of binary user feedback and cold-start situations, that is, predicting ratings for users who have only a few past ratings, and predicting ratings of items that have been rated by only a few users. Results show the potential of Linked Open Data-enabled approaches to outperform existing state-of-the-art algorithms.</p>
      </abstract>
      <kwd-group>
        <kwd>Content-based Recommender Systems</kwd>
        <kwd>Top-N recommendations</kwd>
        <kwd>Implicit Feedback</kwd>
        <kwd>Linked Open Data</kwd>
        <kwd>DBpedia</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>Recently, novel and more accessible forms of information
coming from different open knowledge sources represent a
rapidly growing piece of the big data puzzle.</p>
      <p>Over the last years, more and more semantic data have been
published following the Linked Data principles<sup>1</sup>, by connecting
information referring to geographical locations, people,
companies, books, scientific publications, films, music, TV and [...]
<fn><label>1</label><p>http://www.w3.org/DesignIssues/LinkedData.html</p></fn></p>
      <p>
        Using open or pooled data from many sources, often
combined and linked with proprietary big data, can help develop
insights difficult to uncover with internal data alone [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and
can be effectively exploited by recommender systems to deal
with the classical problems of cold-start and sparsity.
      </p>
      <p>On the other hand, the use of a huge amount of
interlinked data poses new challenges to recommender systems
researchers, who have to find effective ways to integrate such
knowledge into recommendation paradigms.</p>
      <p>This paper presents a preliminary investigation in which
we propose and evaluate different ways of including several
kinds of Linked Open Data features in different classes of
recommendation algorithms. The evaluation is focused on
the top-N recommendation task in the presence of binary user
feedback and cold-start situations.</p>
      <p>
        This paper extends our previous work, carried out to
participate in the Linked Open Data-enabled Recommender
Systems challenge<sup>3</sup> [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], by presenting results for newly tested
algorithms, along with the various combinations of features.
Results show the potential of Linked Open Data-enabled
approaches to outperform existing state-of-the-art algorithms.
      </p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK</title>
      <p>
        Previous attempts to build recommender systems that
exploit Linked Open Data are presented in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], where a music
recommender system uses DBpedia to compute the Linked
Data Semantic Distance, which makes it possible to provide
recommendations by computing the semantic distance for all artists
referenced in DBpedia.
      </p>
      <p>
        In that work, the semantics of the DBpedia relations is not
taken into account, unlike the approach described
in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], where properties extracted from DBpedia and
LinkedMDB [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] are exploited to perform a semantic expansion of the
item descriptions, suitable for learning user profiles.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], DBpedia is used to enrich the playlists extracted
from a Facebook profile with new related artists. Each
artist in the original playlist is mapped to a DBpedia node,
and other similar artists are selected by taking into account
shared properties, such as the genre and the musical
category of the artist.
      </p>
      <p>
        DBpedia is also used in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] to capture the complex
relationships between users, items and entities by extracting the
paths that connect users to items, in order to compute
recommendations through a learning-to-rank algorithm called
SPRank. SPRank is a hybrid recommendation algorithm
able to compute top-N item recommendations from implicit
feedback, which effectively combines ontological knowledge
coming from DBpedia (content-based part) with
collaborative user preferences (collaborative part) in a graph-based
setting. Starting from the common graph-based
representation of the content and collaborative data models, all the
paths connecting the user to an item are considered in
order to obtain a relevance score for that item. The more paths
there are between a user and an item, the more relevant that item is
to that user.
      </p>
      <p>
        The increasing interest in using Linked Open Data to
create a new breed of content-based recommender systems is
witnessed by the success of the recent Linked Open
Data-enabled Recommender Systems challenge held at the
European Semantic Web Conference (ESWC 2014). The contest
consisted of 3 tasks, namely rating prediction in cold-start
situations, top-N recommendation from binary user
feedback, and diversity. Interestingly, top-N recommendation
from binary user feedback was the task with the highest
number of participants. The best performing approach was
an ensemble of algorithms based on popularity,
Vector Space Model, Random Forests, Logistic Regression,
and PageRank, running on a diverse set of semantic features
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The results of the single methods were aggregated
using the Borda count aggregation strategy. Most of the
techniques used in the contest are presented in this paper.
      </p>
      <p>
        Similarly to the best performing approach, the second best
performing one was based on the same ingredients [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
Indeed, it combined different base recommenders, such as
collaborative and content-based ones, with a non-personalized
recommender based on popularity. Content-based strategies
leveraged various feature sets created from DBpedia.
Additional Linked Open Data sources were explored, such as
British Library Bibliography<sup>4</sup> and DBTropes<sup>5</sup>, even though
they did not provide meaningful features with respect to
those derived from DBpedia. The results of the individual
recommenders were combined using stacking regression and
rank aggregation using Borda.
<fn><label>3</label><p>challenges.2014.eswc-conferences.org/index.php/RecSys</p></fn>
      </p>
    </sec>
    <sec id="sec-3">
      <title>METHODOLOGY</title>
      <p>Section 3.1 describes the set of different features extracted
from the Linked Open Data cloud, while Section 3.2 presents
different kinds of recommendation algorithms, i.e. those
based on vector space and probabilistic models, those based
on the use of classifiers, and graph-based algorithms, which
are fed in different ways with the features extracted from the
Linked Open Data cloud.</p>
    </sec>
    <sec id="sec-5">
      <title>Features extracted from the Linked Open Data cloud</title>
      <p>The use of Linked Open Data makes it possible to bridge the gap
between the need for background data and the challenge of
devising novel advanced recommendation strategies.</p>
      <p>There are two main approaches to extract Linked Open
Data features to represent items:
1. use of the Uniform Resource Identifier (URI)
2. use of entity linking algorithms.</p>
      <sec id="sec-5-1">
        <p>The first approach directly extracts DBpedia properties for
each item by using its Uniform Resource Identifier (URI).
URIs are the standard way to identify real-world entities,
and provide an entry point to DBpedia.</p>
        <p>However, DBpedia provides a huge set of properties for
each item, hence a proper strategy to select the most
valuable ones is necessary. We could manually identify and select
a subset of domain-dependent properties, or we could take
into account a subset of the most frequent ones.</p>
        <p>Referring to the book domain, in which we performed the
evaluation, we selected the 10 properties in Table 1, which
are both very frequent and representative of the specific
domain.</p>
        <p>Starting from these properties, further resources could be
recursively added. For example, starting from a book, we
could retrieve its author through the property
http://dbpedia.org/ontology/author
and then retrieve and link other resources by the same
author, or other genres of works by the same author.</p>
        <p>As an example, the resulting representation obtained for
the book The Great and Secret Show is provided in Figure
2. The book is linked to its author (Clive Barker), to the
genre (Fantasy literature), and to the Wikipedia categories
(British fantasy novels and 1980s fantasy novels).
Furthermore, other books by Clive Barker are reported, such as
Books of Blood and Mister B. Gone.</p>
        <p>The second approach to extract LOD features uses entity
linking algorithms to identify a set of Wikipedia concepts
occurring in the item description. Next, those Wikipedia
concepts can be easily mapped to the corresponding DBpedia
nodes.
<fn><label>4</label><p>http://bnb.data.bl.uk/</p></fn>
<fn><label>5</label><p>http://skipforward.opendfki.de/wiki/DBTropes</p></fn></p>
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>Property</th><th>Description</th></tr>
            </thead>
            <tbody>
              <tr><td>http://dbpedia.org/ontology/wikiPageWikiLink</td><td>Link from a Wikipedia page to another Wikipedia page. This property makes it possible to take into account other Wikipedia pages which are somehow related.</td></tr>
              <tr><td>http://purl.org/dc/terms/subject</td><td>The topic of a book.</td></tr>
              <tr><td>http://dbpedia.org/property/genre</td><td>The genre of a book.</td></tr>
              <tr><td>http://dbpedia.org/property/publisher</td><td>The publisher of a book.</td></tr>
              <tr><td>http://dbpedia.org/ontology/author</td><td>The author of a book.</td></tr>
              <tr><td>http://dbpedia.org/property/followedBy</td><td>The book followed by a specific book.</td></tr>
              <tr><td>http://dbpedia.org/property/precededBy</td><td>The book preceded by a specific book.</td></tr>
              <tr><td>http://dbpedia.org/property/series</td><td>The series of a book.</td></tr>
              <tr><td>http://dbpedia.org/property/dewey</td><td>The Dewey Decimal library Classification.</td></tr>
              <tr><td>http://dbpedia.org/ontology/nonFictionSubject</td><td>The subject of a non-fiction book (e.g.: history, biography, cookbook, ...).</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>
          Several techniques can be adopted, such as Explicit
Semantic Analysis [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] or Tagme [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>In this work we adopt Tagme, which implements an anchor
disambiguation algorithm to produce a Wikipedia-based
representation of text fragments, where the most relevant
concepts occurring in the text are mapped to the Wikipedia
articles they refer to. Tagme performs a sort of feature
selection by filtering out the noise in text fragments, and its
main advantage is the ability to annotate very short texts.</p>
        <p>As an example, the resulting representation obtained for
the book The Great and Secret Show is provided in Figure
3. Interestingly, the technique is able to associate several
concepts which are somehow related to the book, and which
could be useful to provide accurate and diverse
recommendations, as well.</p>
        <p>All these features are used in different ways by the
different recommendation algorithms presented in the following
section. Details are reported in Section 4.2.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Recommendation Algorithms</title>
      <p>We tested three different classes of algorithms for
generating top-N recommendations, by using several combinations
of features extracted from the Linked Open Data cloud.</p>
      <sec id="sec-6-1">
        <title>Algorithms based on the Vector Space and Probabilistic Models</title>
        <p>Most content-based recommender systems rely on simple
retrieval models to produce recommendations, such as
keyword matching or the Vector Space Model (VSM).</p>
        <p>VSM emerged as one of the most effective approaches in
the area of Information Retrieval, thanks to its good
compromise between effectiveness and simplicity. Documents
and queries are represented by vectors in an n-dimensional
vector space, where n is the number of index terms (words,
stems, concepts, etc.).</p>
        <p>Formally, each document is represented by a vector of
weights, where weights indicate the degree of association
between the document and index terms.</p>
        <p>Given this representation, documents are ranked by
computing the distance between their vector representations and
the query vector. Let D = {d1, d2, ..., dN} denote a set of
documents or corpus, and T = {t1, t2, ..., tn} be the
dictionary, that is to say the set of words in the corpus. T is
obtained by applying some standard natural language
processing operations, such as tokenization, stopword removal,
and stemming. Each document dj is represented as a vector
in an n-dimensional vector space, so dj = {w1j, w2j, ..., wnj},
where wkj is the weight for term tk in document dj. The
most common weighting scheme is TF-IDF (Term
Frequency-Inverse Document Frequency).</p>
        <p>In content-based recommender systems relying on VSM,
the query is the user profile, obtained as a combination of
the index terms occurring in the items liked by that user,
and recommendations are computed by applying a vector
similarity measure, such as the cosine coefficient, between
the user profile and the items to be recommended in the
same vector space.</p>
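        <p>As an illustration of the VSM-based ranking described above, the following minimal sketch builds TF-IDF vectors, combines the vectors of liked items into a user profile, and ranks candidates by cosine similarity. The item identifiers, the stemmed terms, the helper names and the choice of summing item vectors to obtain the profile are assumptions made for this example only; they are not taken from the evaluated system.</p>
        <preformat>
```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Map each item id to a sparse TF-IDF vector (dict term -> weight)."""
    n_docs = len(docs)
    # Document frequency: number of items in which each term occurs.
    df = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for item, terms in docs.items():
        tf = Counter(terms)
        vectors[item] = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    return vectors


def cosine(u, v):
    """Cosine coefficient between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def user_profile(liked, vectors):
    """Sum the vectors of the liked items (one simple way to combine them)."""
    profile = Counter()
    for item in liked:
        for t, w in vectors[item].items():
            profile[t] += w
    return dict(profile)


# Hypothetical item descriptions after tokenization and stemming.
docs = {
    "b1": ["fantasi", "horror", "novel"],
    "b2": ["fantasi", "magic", "novel"],
    "b3": ["cook", "recip", "food"],
}
vectors = tf_idf_vectors(docs)
profile = user_profile(["b1"], vectors)
ranking = sorted(["b2", "b3"], key=lambda i: cosine(profile, vectors[i]), reverse=True)
```
        </preformat>
        <p>In this toy corpus the candidate sharing terms with the liked item is ranked first, while an item with no shared terms gets a similarity of zero.</p>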
        <p>
          However, VSM is not able to manage either the latent
semantics of each document or the position of the terms
occurring in it. Hence, we proposed an approach able to produce
a lightweight and implicit semantic representation of
documents (items and user profiles). The technique is based on
the distributional hypothesis, according to which “words that
occur in the same contexts tend to have similar meanings”
[
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. This means that the meaning of a word is inferred by
analyzing its usage in large corpora of textual documents,
hence words are semantically similar to the extent that they
share contexts.
        </p>
        <p>
          The gist of the technique is presented in [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], in which a
novel content-based recommendation framework, called
enhanced Vector Space Model (eVSM), is described. eVSM
adopts a latent semantic representation of items in terms
of contexts, i.e. a term-context matrix is adopted, instead
of the classical term-document matrix adopted in the VSM.
The advantage is that the context can be adapted to the
specific granularity level of the representation required by the
application: for example, given a word, its context could
be either a single word it co-occurs with, a sentence, or the
whole document.
        </p>
        <p>
          The use of fine-grained representations of contexts calls
for specific techniques for reducing the dimensionality of
vectors. Besides the classical Latent Semantic Indexing, which
suffers from scalability issues, more scalable techniques were
investigated, such as Random Indexing [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], adopted in the
eVSM model.
        </p>
        <p>Random Indexing is an incremental method that reduces
the dimensionality of a vector space by projecting its points into a
randomly selected subspace of sufficiently high dimensionality. The
goal of using eVSM is to compare a vector space
representation which adopts very few dimensions for representing
items with a classical VSM.</p>
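        <p>The idea behind Random Indexing can be sketched as follows, assuming the common formulation in which every context receives a sparse ternary random index vector and a term vector is the sum of the index vectors of the contexts it occurs in. The function names, the number of non-zero entries and the seeding scheme are hypothetical choices for this example; only the target dimension of 500 is taken from the experiments reported later.</p>
        <preformat>
```python
import random


def index_vector(key, dim, nonzero, seed=0):
    """Sparse ternary random vector for a context: a few +1/-1 entries, rest 0."""
    # Seeding with the context key makes the vector reproducible,
    # so it never has to be stored.
    rng = random.Random(f"{seed}:{key}")
    positions = rng.sample(range(dim), nonzero)
    vec = [0] * dim
    for pos in positions:
        vec[pos] = rng.choice((-1, 1))
    return vec


def random_indexing(term_contexts, dim=500, nonzero=10):
    """Build reduced term vectors by summing the index vectors of their contexts."""
    term_vectors = {}
    for term, contexts in term_contexts.items():
        acc = [0] * dim
        for ctx in contexts:
            iv = index_vector(ctx, dim, nonzero)
            acc = [a + b for a, b in zip(acc, iv)]
        term_vectors[term] = acc
    return term_vectors


# Hypothetical term -> contexts mapping (here, contexts are whole documents).
vecs = random_indexing({"barker": ["d1", "d2"], "fantasy": ["d1"]})
```
        </preformat>
        <p>The method is incremental in the sense that a new context only adds its index vector to the affected term vectors, with no need to recompute the whole space.</p>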
        <p>
          As an alternative to VSM, we used the BM25 probabilistic
model [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], one of the dominant retrieval paradigms
today. The ranking function for matching a query q (user
profile) and an item I is:
        </p>
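        <disp-formula>
          <label>(1)</label>
          <tex-math>R(q, I) = \sum_{t \in q} \frac{n_t \cdot (\alpha + 1)}{n_t + \alpha \cdot \left(1 - \beta + \beta \cdot \frac{|I|}{avgdl}\right)} \cdot idf(t)</tex-math>
        </disp-formula>
        <p>where n<sub>t</sub> is the frequency of t in the item I, |I| is the length of the item,
α and β are free parameters, avgdl is the average item length, and idf(t) is the IDF
of feature t:</p>
        <disp-formula>
          <label>(2)</label>
          <tex-math>idf(t) = \log \frac{N - df(t) + 0.5}{df(t) + 0.5}</tex-math>
        </disp-formula>
        <p>df(t) is the number of items in which the feature t occurs, and
N is the cardinality of the collection.</p>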
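        <p>A minimal sketch of this ranking function is given below, assuming the standard Okapi BM25 form with length normalization 1 - β + β · |I| / avgdl and the usual IDF with +0.5 smoothing. The function and variable names, and the toy statistics, are hypothetical; the default values of alpha and beta are the ones reported in the experimental setup.</p>
        <preformat>
```python
import math


def idf(term, doc_freq, n_items):
    """IDF in the standard BM25 form: log((N - df + 0.5) / (df + 0.5))."""
    df = doc_freq.get(term, 0)
    return math.log((n_items - df + 0.5) / (df + 0.5))


def bm25_score(query_terms, item_terms, doc_freq, n_items, avgdl,
               alpha=1.6, beta=0.75):
    """Score an item against a query (user profile), given as lists of terms."""
    # Length normalization: items longer than average are penalized.
    length_norm = 1.0 - beta + beta * len(item_terms) / avgdl
    score = 0.0
    # Assumption: each distinct query term contributes once.
    for t in set(query_terms):
        n_t = item_terms.count(t)
        score += n_t * (alpha + 1.0) / (n_t + alpha * length_norm) * idf(t, doc_freq, n_items)
    return score


# Hypothetical collection statistics: 5 items, average length 2 terms.
doc_freq = {"fantasy": 1, "novel": 3}
score = bm25_score(["fantasy"], ["fantasy", "novel"], doc_freq, n_items=5, avgdl=2.0)
```
        </preformat>
        <p>Terms that do not occur in the item contribute nothing to the score, so an item sharing no feature with the profile scores zero.</p>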
        <p>
          For all the previous models we explicitly managed
negative preferences of users by adopting the vector negation
operator proposed in [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], based on the concept of
orthogonality between vectors.
        </p>
        <p>
          Several works rely on the Rocchio algorithm [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]
to incrementally refine the user profiles by exploiting positive
and negative feedback provided by users, even though the
method needs extensive parameter tuning to be
effective.
        </p>
        <p>
          Negative relevance feedback is also discussed in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], in
which the idea of representing negation by subtracting an
unwanted vector from a query emerged, although nothing
is stated about how much to subtract. Hence, vector
negation is built on the idea of subtracting exactly the right
amount to make the unwanted vector irrelevant to the
results we obtain.
        </p>
        <p>
          This removal operation is called vector negation, which is
related to the concept of orthogonality, and it is proposed in
[
          <xref ref-type="bibr" rid="ref23">23</xref>
          ].
        </p>
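        <p>One way to read "subtracting exactly the right amount" is as an orthogonal projection: the component of the positive vector that lies along the unwanted vector is removed, so that the result is orthogonal to it. The sketch below assumes this formulation, a NOT b = a - (a·b / b·b) b; the function names and the toy vectors are hypothetical.</p>
        <preformat>
```python
def dot(u, v):
    """Inner product of two dense vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))


def negate(a, b):
    """Return a NOT b: the component of a that is orthogonal to b."""
    bb = dot(b, b)
    if bb == 0.0:
        # Nothing to remove when the unwanted vector is null.
        return list(a)
    scale = dot(a, b) / bb
    return [x - scale * y for x, y in zip(a, b)]


# Hypothetical profile built from liked items, and a vector of disliked features.
liked = [1.0, 2.0, 0.0]
disliked = [0.0, 1.0, 0.0]
profile = negate(liked, disliked)
```
        </preformat>
        <p>After negation the profile has zero inner product with the unwanted vector, so an item described only by disliked features gets a cosine similarity of zero with the profile.</p>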
      </sec>
      <sec id="sec-6-2">
        <title>Algorithms based on Classifiers</title>
        <p>The recommendation process can be seen as a binary
classification task, in which each item has to be classified as
interesting or not with respect to the user preferences.</p>
        <p>
          We learned classifiers using two algorithms, namely
Random Forests (RF) [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and Logistic Regression (LR).
        </p>
        <p>RF is an ensemble learning method, combining different
tree predictors built using different samples of the training
data and random subsets of the data features. The class of
an item is determined by the majority voting of the classes
returned by the individual trees. The use of different
samples of the data from the same distribution and of different
sets of features for learning the individual trees helps prevent
overfitting.</p>
        <p>LR is a supervised learning method for classification which
builds a linear model based on a transformed target variable.</p>
      </sec>
      <sec id="sec-6-3">
        <title>Graph-based Algorithms</title>
        <p>
          We adopted PageRank with Priors, widely used to obtain
an authority score for a node based on the network
connectivity. Differently from PageRank, it is biased towards
the preferences of a specific user, by adopting a non-uniform
personalization vector to assign different weights to different
nodes [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
        <p>In order to run PageRank, we need to represent data
using a graph model. To this purpose, users and items in
the dataset are represented as nodes of a graph, while links
represent the users’ positive feedback. The graph
may be enriched in different ways, for example by exploiting
entities and relations coming from DBpedia: in this case the
whole graph would contain nodes representing users, items,
and entities, and edges representing items relevant to users,
and relations between entities. This unified representation
makes it possible to take into account both collaborative and
content-based features to produce recommendations.</p>
        <p>In the classic PageRank, the prior probability assigned to
each node is evenly distributed (1/N, where N is the number of
nodes), while PageRank with Priors is biased towards some
nodes, i.e. the preferences of a specific user (see Section 4.2).</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>EXPERIMENTAL EVALUATION</title>
      <p>The goal of the experiments is to evaluate the
contribution of diverse combinations of features, including those
extracted from the Linked Open Data cloud, to the accuracy
of different classes of recommendation algorithms.</p>
      <p>The experiments address the following questions:
1. What is the contribution of the Linked Open Data
features to the accuracy of top-N recommendation
algorithms, in the presence of binary user feedback and
cold-start situations?
2. Do the Linked Open Data-enabled approaches
outperform existing state-of-the-art recommendation
algorithms?</p>
    </sec>
    <sec id="sec-8">
      <title>Dataset</title>
      <p>The dataset used in the experiment is DBbook, coming
from the recent Linked Open Data-enabled Recommender
Systems challenge. It contains user preferences retrieved
from the Web in the book domain. Each book is mapped
to the corresponding DBpedia URI, which can be used to
extract features from different datasets in the Linked Open
Data cloud.</p>
      <p>The training set released for the top-N recommendation
task contains 72,372 binary ratings provided by 6,181 users
on 6,733 items. The dataset sparsity is 99.83%, and the
distribution of ratings is reported in Table 2.</p>
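      <p>The sparsity figure follows directly from the counts above, taking sparsity as the fraction of user-item pairs without a rating. The short check below uses only the three numbers reported for the training set; the variable names are arbitrary.</p>
      <preformat>
```python
ratings, users, items = 72_372, 6_181, 6_733

# Density is the share of user-item pairs that carry a rating.
density = ratings / (users * items)
sparsity = 1.0 - density

print(f"{sparsity:.2%}")  # prints 99.83%
```
      </preformat>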
      <p>The test set contains user-item pairs to rank in order to
produce a top-5 item recommendation list for each user, to
be evaluated using the F1@5 accuracy measure.</p>
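      <p>For reference, F1@5 can be computed per user as the harmonic mean of precision and recall over the top-5 list. The sketch below assumes that precision is the number of hits divided by k and that recall is the number of hits divided by the number of relevant items of the user; the exact definition used by the challenge evaluation service is not stated here, and the function name and toy data are hypothetical.</p>
      <preformat>
```python
def f1_at_k(ranked_items, relevant_items, k=5):
    """F1 of the top-k recommendation list for a single user."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    if hits == 0:
        return 0.0
    precision = hits / k
    recall = hits / len(relevant_items)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ranking and ground truth for one user.
ranked = ["a", "b", "c", "d", "e", "f"]
relevant = {"a", "c", "x"}
user_f1 = f1_at_k(ranked, relevant)
```
      </preformat>
      <p>The overall score would then be the average of the per-user values.</p>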
    </sec>
    <sec id="sec-9">
      <title>Experimental setup</title>
      <p>Each recommendation algorithm is fed with a diverse set of
features.</p>
      <p>Besides TAGME and LOD features, algorithms may also use
BASIC features, i.e. the number of positive, negative, and
total feedback items provided by each user and received by
each item, together with the ratios between the positive,
negative and total numbers of feedback items, and CONTENT
features, obtained by processing book
descriptions gathered from Wikipedia. A simple NLP pipeline
removes stopwords and applies stemming. For books not
existing in Wikipedia, DBpedia abstracts were processed.</p>
      <p>For all the methods, the 5 most popular items are assigned
as liked to users with no positive ratings in the training set.
Indeed, 5.37 is the average number of positive ratings for
each user in the dataset (see Table 2).</p>
      <sec id="sec-9-1">
        <title>Algorithms based on the Vector Space and Probabilistic Models.</title>
        <p>Recommender systems relying on the VSM and the probabilistic
framework index items using CONTENT, TAGME and LOD
features, and use as query the user profile obtained by
combining all the index terms occurring in the items liked by that
user.</p>
        <p>
          Items in the test set are ranked by computing the
similarity with the user profile. For VSM and eVSM the cosine
measure is adopted, while Equation 1 is used for the
probabilistic model. According to the literature [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], parameters
α and β are set to 1.6 and 0.75, respectively.
        </p>
      </sec>
      <sec id="sec-9-2">
        <title>Algorithms based on Classifiers.</title>
        <p>
          Classifiers based on Random Forests and Logistic
Regression are trained with examples represented using CONTENT,
TAGME and LOD features, and labeled with the binary ratings
provided by users. The value of each feature is the
number of times it occurs in each item, normalized in the [0,1]
interval.
        </p>
        <p>The LR classifier always includes BASIC features in the
training examples, while these did not provide valuable
results for RF.</p>
        <p>The RF classifier used 1,500 trees to provide a good
trade-off between accuracy and efficiency.</p>
        <p>For Logistic Regression we adopted the implementation
provided by Liblinear<sup>6</sup>, while for Random Forests we adopted
the implementation provided by the Weka library<sup>7</sup>.</p>
        <p>Top-N recommendations are produced by ranking items
according to the probability of the class.</p>
      </sec>
      <sec id="sec-9-3">
        <title>Graph-based Algorithms.</title>
        <p>PageRank with Priors is performed (for each single user)
using graphs with different sets of nodes. Initially, only
users, items and links represented by the positive feedback
are included; next, we enriched the graph with the 10
properties extracted from DBpedia (see Section 3.1). Then, we
ran a second-level expansion stage of the graph to retrieve
the following additional resources:
1. internal wiki links of the newly added nodes
2. more generic categories according to the hierarchy in DBpedia
3. resources of the same category
4. resources of the same genre
5. genres pertaining to the author of the book
6. resources written by the author
7. genres of the series the book belongs to.</p>
          <sec id="sec-9-3-4">
          <p>This process adds thousands of nodes to the original graph.
For this reason, we pruned the graph by removing nodes
which are neither users nor books and which have fewer than 5
inlinks and outlinks in total. This graph eventually
consisted of 340,000 nodes and 6 million links.</p>
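          <p>The pruning rule can be sketched as a degree filter over the edge list. This is only an illustration under stated assumptions: the graph is given as a list of directed (source, target) pairs, pruning is done in a single pass (degrees are not recomputed after removals), and the function name, node identifiers and toy edges are hypothetical.</p>
          <preformat>
```python
def prune(edges, protected, min_degree=5):
    """Drop unprotected nodes whose inlinks + outlinks total fewer than min_degree."""
    degree = {}
    for src, dst in edges:
        degree[src] = degree.get(src, 0) + 1
        degree[dst] = degree.get(dst, 0) + 1
    # Users and books are always kept; other nodes need enough links.
    keep = {n for n, d in degree.items() if n in protected or d >= min_degree}
    return [(s, d) for s, d in edges if s in keep and d in keep]


# Hypothetical graph: one user, five books, a well-connected genre node
# and a rarely linked entity.
edges = [("u1", "b1"), ("b1", "rare")] + [(f"b{i}", "genre") for i in range(1, 6)]
protected = {"u1"} | {f"b{i}" for i in range(1, 6)}
pruned = prune(edges, protected)
```
          </preformat>
          <p>Here the entity linked only once is removed together with its edge, while the genre node with five links survives.</p>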
          <p>The prior probabilities assigned to nodes depend on the
users’ preferences, and are assigned according to the
following heuristics: 80% of the total weight is evenly distributed
among items liked by users (0 assigned to disliked items),
20% is evenly distributed among the remaining nodes. We
ran the algorithm with a damping factor set to 0.85.</p>
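          <p>The heuristic for the priors and the resulting personalized ranking can be illustrated with a small power-iteration sketch. It is not the Jung implementation used in the experiments: the function names, the fixed number of iterations, the handling of nodes without outgoing links and the toy graph are assumptions of this example, while the 80%/20% split and the damping factor of 0.85 are the values stated above. The sketch also assumes that every user has at least one liked item.</p>
          <preformat>
```python
def priors(nodes, liked, disliked, liked_share=0.8):
    """80% of the prior mass on liked items, 0 on disliked ones, the rest spread evenly."""
    others = [n for n in nodes if n not in liked and n not in disliked]
    p = {n: 0.0 for n in nodes}
    for n in liked:
        p[n] = liked_share / len(liked)
    for n in others:
        p[n] = (1.0 - liked_share) / len(others)
    return p


def pagerank_with_priors(out_links, prior, damping=0.85, iterations=50):
    """Power iteration in which the random jump follows the prior distribution."""
    nodes = list(prior)
    rank = dict(prior)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * prior[n] for n in nodes}
        for n in nodes:
            targets = out_links.get(n, [])
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: return its mass to the prior distribution.
                for m in nodes:
                    nxt[m] += damping * rank[n] * prior[m]
        rank = nxt
    return rank


# Hypothetical graph: user u likes book b1, which shares entity e with book b2;
# b3 is a disliked, unconnected book. Links are listed in both directions.
out_links = {"u": ["b1"], "b1": ["u", "e"], "e": ["b1", "b2"], "b2": ["e"]}
nodes = ["u", "b1", "b2", "b3", "e"]
prior = priors(nodes, liked={"b1"}, disliked={"b3"})
rank = pagerank_with_priors(out_links, prior)
```
          </preformat>
          <p>In this toy graph the unrated book b2 receives a positive score through the entity it shares with the liked book, whereas the disliked and unconnected book b3 stays at zero.</p>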
          <p>We adopted the implementation of PageRank provided by
the Jung library<sup>8</sup>.</p>
          <p>The PageRank computed for each node is used to rank
items in the test set.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>Results</title>
      <table-wrap>
        <label>Table 2</label>
        <table>
          <tbody>
            <tr><th>Statistics about users</th><th>Value</th></tr>
            <tr><td>Avg. ratings provided by users</td><td/></tr>
            <tr><td># of users who provided only negative ratings</td><td/></tr>
            <tr><td># of users having a number of positive ratings below the avg.</td><td/></tr>
            <tr><td># of users having more negative than positive ratings</td><td/></tr>
            <tr><th>Statistics about items</th><th>Value</th></tr>
            <tr><td>Avg. ratings received by items</td><td/></tr>
            <tr><td># of items with no positive ratings</td><td/></tr>
            <tr><td># of items having a number of positive ratings below the avg.</td><td/></tr>
            <tr><td># of items having more negative than positive ratings</td><td/></tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-10-2">
        <p>The best configuration for eVSM adopts TAGME features
alone, and is significantly better than all the configurations
but the one combining CONTENT and TAGME features (p =
0.13). This could mean that the entity linking algorithm
is able to select the most important features in the book
descriptions, while CONTENT features introduce noise.</p>
        <p>For BM25, the best configuration with ALL the features
significantly outperforms all the others but the one
combining CONTENT and LOD features (p = 0.53).</p>
        <p>Surprisingly, there is no statistically significant difference between the
best performing configuration for VSM and the best one for
BM25.</p>
        <p>A final remark is that eVSM performance is not
comparable to the other methods, even though it is worth noting that
it represents items using very low-dimensional vectors
(dimension=500), compared to VSM, which uses vectors whose
dimensionality is equal to the number of items (6,733).</p>
        <p>Figure 5 presents the results obtained by the classifiers.</p>
        <p>We note that Logistic Regression always outperforms
Random Forests, and provides better results than the vector
space and probabilistic models, regardless of the set of adopted
features.</p>
        <p>The best result using Logistic Regression is obtained with
TAGME features alone. This configuration significantly
outperforms the one including CONTENT and LOD features (p &lt;
0.05), while it does not differ significantly from the other
configurations. This is probably due to the high sparsity of
the feature vector used to represent each training example
(220,000 features).</p>
        <p>Random Forests classifiers outperform eVSM, but they are
worse than vector space and probabilistic models. The best
result is obtained using ALL features. Since Random Forests
classifiers are able to automatically perform feature
selection, this was an unexpected result which deserves further
investigation.</p>
        <p>Finally, Figure 6 presents the results obtained by the
PageRank with Priors algorithm.</p>
        <p>[Figure 6: PageRank with Priors. Graph configurations: No Content; 10 properties; 10 properties + expansion stage + pruning; 10 properties + expansion stage. Legend: PageRank; Execution time.]</p>
        <p>When using PageRank with Priors, we observe the impact
of the graph size on both accuracy and execution time.
Starting with a graph that does not include content information, we
observe the worst performance and the lowest execution time
(2 hours on an Intel i7 at 3 GHz with 32 GB of RAM; the algorithm
is run for each user with different weights initially
assigned to the nodes).</p>
        <p>Enriching the graph with the 10 selected DBpedia
properties leads to an improvement in accuracy (p &lt; 0.001), and to
a 5-hour execution time. Running the expansion stage and the
pruning of nodes as described in Section 4.2 increases the time
needed to run the algorithm to 14 hours and produces a
slight accuracy improvement (p &lt; 0.001). Results using the
graph with no pruning procedure are not significantly different from the
previous method (p = 0.09), but its running time is not
acceptable. This calls for a more efficient implementation of
the algorithm.</p>
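        <p>The per-user computation described above can be sketched as a power iteration in which the uniform teleport distribution is replaced by user-specific prior weights. This is a minimal sketch and not the implementation used in the experiments: the toy graph, the damping factor of 0.85, the number of iterations and the way prior mass is split between liked items and the remaining nodes are all assumptions.</p>

```python
def pagerank_with_priors(graph, priors, damping=0.85, iterations=50):
    """graph: dict mapping each node to the list of its neighbours
    (every neighbour must also be a key of the dict).
    priors: dict mapping nodes to non-negative weights (normalised here)."""
    nodes = list(graph)
    total = sum(priors.get(n, 0.0) for n in nodes)
    prior = {n: priors.get(n, 0.0) / total for n in nodes}
    rank = dict(prior)
    for _ in range(iterations):
        # Teleport step: jump back to nodes according to the user's priors.
        new_rank = {n: (1.0 - damping) * prior[n] for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if not neighbours:
                # Dangling node: redistribute its mass according to the priors.
                for m in nodes:
                    new_rank[m] += damping * rank[n] * prior[m]
                continue
            share = damping * rank[n] / len(neighbours)
            for m in neighbours:
                new_rank[m] += share
        rank = new_rank
    return rank


# Toy graph: one user, three books, one DBpedia-like property node (hypothetical).
graph = {
    "u1": ["book_a", "book_b"],
    "book_a": ["u1", "author_x"],
    "book_b": ["u1"],
    "book_c": ["author_x"],
    "author_x": ["book_a", "book_c"],
}
# Assumed split: most of the prior mass goes to the items liked by u1.
priors = {"book_a": 0.4, "book_b": 0.4, "u1": 0.2 / 3,
          "book_c": 0.2 / 3, "author_x": 0.2 / 3}

scores = pagerank_with_priors(graph, priors)
unseen = ["book_c"]
print(sorted(unseen, key=lambda item: scores[item], reverse=True))
```

        <p>Because the priors change for every user, the whole iteration has to be repeated per user, and its cost grows with the number of nodes and edges. This is consistent with the longer execution times observed as the graph is enriched.</p>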
        <p>To complete the empirical evaluation, we compare the best
performing configuration of each algorithm in each class
with some state-of-the-art algorithms.</p>
        <p>More specifically, we report the performance of
user-to-user and item-to-item collaborative filtering, as well as two
non-personalized baselines based on popularity and random
recommendations.</p>
        <p>
          Furthermore, we report the results for two algorithms for
top-N recommendations from implicit feedback: an
extension of matrix factorization optimized for Bayesian
Personalized Ranking (BPRMF) [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] and SPRank [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], which is able to exploit
Linked Open Data knowledge bases to compute accurate
recommendations.
        </p>
        <p>
          Except for SPRank, we used the implementations
available in MyMediaLite 3.10 [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], using the default parameters.
        </p>
        <p>
          The analysis of results in Figure 7 reveals the difficulty
that collaborative filtering algorithms have in dealing with the high
sparsity of the dataset (99.83%), and with the high number
of users who provided only negative preferences, or more
negative than positive ratings. The better performance of
BPRMF compared to SPRank is unexpected, and differs
from previous results obtained on the MovieLens and
Last.fm datasets [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. The better performance of simple algorithms based on
the vector space and probabilistic models with respect to
matrix factorization is also surprising.
        </p>
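        <p>The dataset characteristics mentioned above can be computed directly from binary feedback. The sketch below assumes ratings stored as hypothetical (user, item, liked) triples and uses toy data; it illustrates the definitions of sparsity and of users with only negative or mostly negative feedback, and does not reproduce the figures reported here.</p>

```python
from collections import Counter


def feedback_stats(ratings, num_users, num_items):
    """ratings: iterable of (user, item, liked) triples, with liked as a bool."""
    ratings = list(ratings)
    positives = Counter(user for user, _item, liked in ratings if liked)
    negatives = Counter(user for user, _item, liked in ratings if not liked)
    users = set(positives) | set(negatives)
    return {
        # Fraction of user-item pairs for which no feedback is available.
        "sparsity": 1.0 - len(ratings) / (num_users * num_items),
        # Users who never expressed a positive preference.
        "only_negative_users": sum(1 for u in users if positives[u] == 0),
        # Users with more negative than positive ratings (includes the above).
        "more_negative_users": sum(1 for u in users if negatives[u] > positives[u]),
    }


# Toy feedback: 3 users, 4 items, 5 ratings (made-up values).
toy_ratings = [
    ("u1", "a", True), ("u1", "b", False),
    ("u2", "a", False), ("u2", "c", False),
    ("u3", "d", False),
]
stats = feedback_stats(toy_ratings, num_users=3, num_items=4)
print(stats)
```

        <p>With a sparsity close to 1 and many users lacking positive ratings, neighbourhood-based methods have very few co-rated items to rely on, which is one plausible reading of the results discussed above.</p>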
        <p>[Figure: Overall Comparison. Vertical axis: F1, from 0.5000 to 0.5600.]</p>
        <p>The analysis of the previous results allows us to conclude
that TAGME and LOD features have the potential to improve
the performance of several recommendation algorithms for
computing top-N recommendations from binary user
feedback.</p>
        <p>However, in order to generalize our preliminary results, it
is necessary to further investigate:
• the effect of different levels of sparsity on the
recommendation accuracy: to this purpose, we need to
assess the extent to which LOD features are able to
improve the performance of recommendation algorithms
for different levels of sparsity;
• the accuracy on other datasets to generalize our
conclusions: further experiments on different target
domains are needed. Indeed, different item types, such
as books, movies, news, and songs, have different
characteristics which could lead to different results. Moreover,
experiments on a much larger scale are needed;
• the effect of the selection of domain-specific DBpedia
properties to feed the recommendation algorithms: we
need to assess the effect of the selection of
specific sets of properties on the performance of the
recommendation algorithms. Indeed, DBpedia contains a
huge number of properties, and their selection could
have a strong influence on the accuracy of the
recommendation methods. Our preliminary experiments
leverage 10 DBpedia properties which are both frequent
and representative of the specific domain, but a subset
of these properties, or a different set of features, could
lead to different results.</p>
        <p>As future work, we will study the effect of enriching the
graph-based representation with DBpedia nodes extracted
by the TAGME entity linking algorithm.</p>
        <p>Indeed, using entity linking to access DBpedia knowledge
is innovative and avoids the need to explicitly find URIs
for items, a complex process which may hinder the use of
Linked Open Data. Hence, entity linking
algorithms represent a novel way to access DBpedia
knowledge through the analysis of the item descriptions, without
exploiting any explicit mapping of items to URIs.</p>
        <p>
          Furthermore, starting from the preliminary evaluation
carried out in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], we will thoroughly investigate the potential
of using the wealth of relations of LOD features to produce
not only accurate, but also diversified recommendation lists.
        </p>
      </sec>
    </sec>
    <sec id="sec-11">
      <title>Acknowledgments</title>
      <p>This work fulfils the research objectives of the project
“VINCENTE - A Virtual collective INtelligenCe ENvironment
to develop sustainable Technology Entrepreneurship
ecosystems” (PON 02 00563 3470993) funded by the Italian
Ministry of University and Research (MIUR).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Musto</surname>
          </string-name>
          , M. de Gemmis,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lops</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Narducci</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Semeraro</surname>
          </string-name>
          .
          <article-title>Aggregation strategies for linked open data-enabled recommender systems</article-title>
          .
          <source>In European Semantic Web Conference (Satellite Events)</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          .
          <article-title>The emerging web of linked data</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          ,
          <volume>24</volume>
          (
          <issue>5</issue>
          ):
          <fpage>87</fpage>
          -
          <lpage>92</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Breiman</surname>
          </string-name>
          .
          <article-title>Random forests</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>45</volume>
          (
          <issue>1</issue>
          ):
          <fpage>5</fpage>
          -
          <lpage>32</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Chui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Manyika</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. V.</given-names>
            <surname>Kuiken</surname>
          </string-name>
          .
          <article-title>What executives should know about open data</article-title>
          .
          <source>McKinsey Quarterly</source>
          ,
          January
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Di Noia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mirizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. C.</given-names>
            <surname>Ostuni</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Romito</surname>
          </string-name>
          .
          <article-title>Exploiting the web of data in model-based recommender systems</article-title>
          . In P. Cunningham,
          <string-name>
            <given-names>N. J.</given-names>
            <surname>Hurley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Guy</surname>
          </string-name>
          , and S. S. Anand, editors,
          <source>Proceedings of the ACM Conference on Recommender Systems '12</source>
          , pages
          <fpage>253</fpage>
          -
          <lpage>256</lpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Dunlop</surname>
          </string-name>
          .
          <article-title>The effect of accessing nonmatching documents on relevance feedback</article-title>
          .
          <source>ACM Trans. Inf. Syst.</source>
          ,
          <volume>15</volume>
          :
          <fpage>137</fpage>
          -
          <lpage>153</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ferragina</surname>
          </string-name>
          and
          <string-name>
            <given-names>U.</given-names>
            <surname>Scaiella</surname>
          </string-name>
          .
          <article-title>Fast and Accurate Annotation of Short Texts with Wikipedia Pages</article-title>
          .
          <source>IEEE Software</source>
          ,
          <volume>29</volume>
          (
          <issue>1</issue>
          ):
          <fpage>70</fpage>
          -
          <lpage>75</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Gabrilovich</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Markovitch</surname>
          </string-name>
          .
          <article-title>Wikipedia-based semantic interpretation for natural language processing</article-title>
          .
          <source>Journal of Artificial Intelligence Research (JAIR)</source>
          ,
          <volume>34</volume>
          :
          <fpage>443</fpage>
          -
          <lpage>498</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gantner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Drumond</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Freudenthaler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rendle</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmidt-Thieme</surname>
          </string-name>
          .
          <article-title>Learning attribute-to-feature mappings for cold-start recommendations</article-title>
          . In
          <string-name>
            <given-names>G. I.</given-names>
            <surname>Webb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gunopulos</surname>
          </string-name>
          , and X. Wu, editors,
          <source>10th IEEE International Conference on Data Mining</source>
          , pages
          <fpage>176</fpage>
          -
          <lpage>185</lpage>
          . IEEE Computer Society,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gantner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rendle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Freudenthaler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmidt-Thieme</surname>
          </string-name>
          .
          <article-title>MyMediaLite: a free recommender system library</article-title>
          . In B. Mobasher,
          <string-name>
            <given-names>R. D.</given-names>
            <surname>Burke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          , and G. Adomavicius, editors,
          <source>Proceedings of the ACM Conference on Recommender Systems '11</source>
          , pages
          <fpage>305</fpage>
          -
          <lpage>308</lpage>
          . ACM,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Z. S.</given-names>
            <surname>Harris</surname>
          </string-name>
          .
          <source>Mathematical Structures of Language</source>
          . Interscience, New York,
          <year>1968</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>O.</given-names>
            <surname>Hassanzadeh</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. P.</given-names>
            <surname>Consens</surname>
          </string-name>
          .
          <article-title>Linked movie data base</article-title>
          . In C. Bizer,
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          and K. Idehen, editors,
          <source>Proceedings of the WWW2009 Workshop on Linked Data on the Web, LDOW 2009</source>
          , volume
          <volume>538</volume>
          of CEUR Workshop Proceedings. CEUR-WS.org,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>T. H.</given-names>
            <surname>Haveliwala</surname>
          </string-name>
          .
          <article-title>Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search</article-title>
          .
          <source>IEEE Trans. Knowl. Data Eng.</source>
          ,
          <volume>15</volume>
          (
          <issue>4</issue>
          ):
          <fpage>784</fpage>
          -
          <lpage>796</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Musto</surname>
          </string-name>
          .
          <article-title>Enhanced vector space models for content-based recommender systems</article-title>
          .
          <source>In Proceedings of the ACM Conference on Recommender Systems '10</source>
          , pages
          <fpage>361</fpage>
          -
          <lpage>364</lpage>
          . ACM,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>Musto</surname>
          </string-name>
          , G. Semeraro,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lops</surname>
          </string-name>
          , M. de Gemmis, and
          <string-name>
            <given-names>F.</given-names>
            <surname>Narducci</surname>
          </string-name>
          .
          <article-title>Leveraging social media sources to generate personalized music playlists</article-title>
          . In C. Huemer and P. Lops, editors,
          <source>E-Commerce and Web Technologies - 13th International Conference, EC-Web 2012</source>
          , volume
          <volume>123</volume>
          of Lecture Notes in Business Information Processing
          , pages
          <fpage>112</fpage>
          -
          <lpage>123</lpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>V. C.</given-names>
            <surname>Ostuni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Di Noia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Di Sciascio</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Mirizzi</surname>
          </string-name>
          .
          <article-title>Top-n recommendations from implicit feedback leveraging linked open data</article-title>
          .
          In
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>King</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pu</surname>
          </string-name>
          , and G. Karypis, editors,
          <source>Proceedings of the ACM Conference on Recommender Systems '13</source>
          , pages
          <fpage>85</fpage>
          -
          <lpage>92</lpage>
          . ACM,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Passant</surname>
          </string-name>
          .
          <article-title>dbrec - Music Recommendations Using DBpedia</article-title>
          .
          <source>In International Semantic Web Conference, Revised Papers</source>
          , volume
          <volume>6497</volume>
          of LNCS
          , pages
          <fpage>209</fpage>
          -
          <lpage>224</lpage>
          . Springer,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ristoski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Mencia</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Paulheim</surname>
          </string-name>
          .
          <article-title>A hybrid multi-strategy recommender system using linked open data</article-title>
          .
          <source>In European Semantic Web Conference (Satellite Events)</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Robertson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Walker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Beaulieu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gull</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Lau</surname>
          </string-name>
          .
          <article-title>Okapi at TREC</article-title>
          .
          <source>In Text REtrieval Conference</source>
          , pages
          <fpage>21</fpage>
          -
          <lpage>30</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Robertson</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Zaragoza</surname>
          </string-name>
          .
          <article-title>The probabilistic relevance framework: BM25 and beyond</article-title>
          .
          <source>Foundations and Trends in Information Retrieval</source>
          ,
          <volume>3</volume>
          (
          <issue>4</issue>
          ):
          <fpage>333</fpage>
          -
          <lpage>389</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Rocchio</surname>
          </string-name>
          .
          <article-title>Relevance Feedback Information Retrieval</article-title>
          . In Gerald Salton, editor,
          <source>The SMART retrieval system - experiments in automated document processing</source>
          , pages
          <fpage>313</fpage>
          -
          <lpage>323</lpage>
          . Prentice-Hall, Englewood Cliffs, NJ,
          <year>1971</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sahlgren</surname>
          </string-name>
          .
          <article-title>An introduction to random indexing</article-title>
          .
          <source>In Proc. of the Methods and Applications of Semantic Indexing Workshop at the 7th Int. Conf. on Terminology and Knowledge Engineering</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>D.</given-names>
            <surname>Widdows</surname>
          </string-name>
          .
          <article-title>Orthogonal negation in vector spaces for modelling word-meanings and document retrieval</article-title>
          .
          <source>In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>136</fpage>
          -
          <lpage>143</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>