<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Learning to Identify Complementary Products from DBpedia</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anas Alzoghbi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science University of Freiburg Georges-Köhler-Allee 051</institution>
          ,
          <addr-line>79110 Freiburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Identifying the complementary relationship between products, like a cartridge to a printer, is a very useful technique to provide recommendations. These are typically purchased together or within a short time frame, and thus online retailers benefit from it. Existing approaches rely heavily on transactions and therefore they suffer from: (1) the cold start problem for new products; (2) the inability to produce good results for infrequently bought products; (3) the inability to explain why two products are complementary. We propose a framework that aims at alleviating these problems by exploiting a knowledge graph (DBpedia) in addition to products' available information such as titles, descriptions and categories, rather than transactions. Our problem is modeled as a classification task on a set of product pairs. Our starting point is the semantic paths in the knowledge graph linking product attributes, from which we model product features. Then, having a labeled set of product pairs, we learn a model; and finally, we use this model to predict complementary products for an unseen set of products. Our experiments on a real-world dataset from Amazon show the high performance of our framework in predicting whether one product is complementary to another.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Online retailers like Amazon or eBay sell millions of
products from several categories and continuously expand their
catalogs. Being aware that browsing a large catalog can
be overwhelming for a user and can therefore have a
negative impact on revenue [
        <xref ref-type="bibr" rid="ref24 ref3">3, 24</xref>
        ], it is not a surprise that
many of these retailers are equipped with Recommender
Systems (RS) [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. While recommending products based on the
user's taste is certainly an important task any RS has to
carry out, special attention has to be paid at the moment
in which a purchase is about to be made. In this
circumstance the RS can expose the user interested in a product
to other products whose use is related to their desired
product. Those products are referred to as complementary
products [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. For instance, a user who placed a guitar in the
shopping cart might also be interested in purchasing a
guitar tuner, a case, or a book about learning how to play it. It
is important to notice that complementary products might
belong to a different category than that of the associated
product, e.g. Instrument Accessories, Bags &amp; Cases and
Books in the example above. This means that there are
relationships at the level of categories too. Some of
them are evident, like Instruments and Instrument
Accessories, while some of them are not immediately obvious, like
Instruments and Books.
      </p>
      <p>
        The majority of the approaches to find complementary
products make use of transactional data, e.g. by
extracting association rules or by applying other data mining
techniques [
        <xref ref-type="bibr" rid="ref22 ref27 ref31 ref32 ref7">7, 22, 27, 31, 32</xref>
        ]. This means that new catalog
products have no chance to appear as complementary to any
other product. In addition to this, these approaches might
also produce noisy recommendations for products which are
infrequently purchased and are not able to explain the
reason for the recommendation [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
      <p>Therefore, a model is required from which this relationship
between a complementary product and its associated
product can be learned. We believe that a graph representation is
the most suitable for this. Products, their attributes and the
categories to which they belong, as well as the
interconnections between them can be represented as nodes and paths
in the graph. Having such a model would not only allow
us to find links between a complementary product and its
associated product without any need for transactions, but
also to provide explanations of why this relationship exists.</p>
      <p>However, learning to discriminate between different kinds
of relationships from those interconnections within a
complex semantic graph model is still an open problem. See
for instance the feature vector illustrated in Figure 1, which
represents a pair of products (pi, pj). Each of the features
should indeed reflect a property which characterizes both
products as a whole, or a certain interaction between them.
A class label, complementary or non-complementary, is
associated with each sample. While trying to fit a
classification model which learns how to discriminate the
relationship from a set of samples like those might seem the obvious
way of solving the problem, the problem remains: what are
those features that reflect some property of pairs of
products? What do they reflect? Is this property related to the
fact that one product might be complementary to the other
one? This work aims at addressing these questions.
Contributions. The contributions of this paper can be
summarized as follows.</p>
      <p>1. We propose CompLoD, a novel and extensible
framework that predicts the complementary relationship
between two products by using only textual information
such as titles, descriptions and categories.
2. The approach does not require transactions as other
approaches do and therefore it is immune to the
cold-start problem for new items.
3. Our scheme bridges the gap of how to encode
information from a knowledge graph into a relevant feature
set to fit a classification model and predict the
complementary relationship.
4. Rather than building a knowledge graph from scratch,
we believe that Semantic Web standards have enough
to offer in this regard. Therefore, we use DBpedia1, a
well-known Linked Open Data (LoD) dataset, as the
core of the knowledge graph. This dataset is the
structured counterpart of Wikipedia and it is modeled as an
RDF graph. We thereby demonstrate the usefulness of
Semantic Web data.</p>
      <p>
        We evaluate the accuracy of our approach by conducting
experiments using Amazon data [
        <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
        ] as our ground truth
and find that it produces competitive results in comparison
to those shown in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] when taking into account that no
transaction data is used. Our paper is structured as
follows: we provide the notation and definitions
used within this paper in section 2. In section 3 the
workflow of CompLoD is explained in detail. Section 4 shows
the experiments that have been carried out to validate our
system. In section 5 we present the related work. Finally,
we present our conclusions in section 6.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. NOTATION</title>
      <p>This section provides some fundamental definitions which
allow us to explain the system in detail. Our approach
requires modeling four elements within an RDF graph:
products, attributes and the categories to which products
belong, in addition to the relationships between all of these
elements. Attributes are characteristics of a product which
are extracted from a product's specification. Categories are
instead given and are classified in a taxonomy. For instance,
bronze strings is an attribute of a guitar. Moreover, a
guitar might belong to categories such as Musical instruments,
Guitars and Acoustic Guitars. Categories are hierarchically
arranged: acoustic guitars are a certain kind of guitar and
these are a certain kind of musical instrument.
1 http://wiki.DBpedia.org/</p>
      <p>
        Since RDF is used as an underlying data model, we need
to introduce some notation. Let I, B and L be the set of all
IRIs (Internationalized Resource Identifiers), the set of all
blank nodes and the set of all literals, respectively. Then a
triple (rs, p, ro) ∈ (I ∪ B) × I × (I ∪ B ∪ L) is called
an RDF triple. An RDF graph G is a set of RDF triples.
Moreover, let V be a set of variables. A tuple of the set
(I ∪ L ∪ V) × (I ∪ V) × (I ∪ L ∪ V), e.g. (?x, p, ro), is
called a triple pattern, in which variable names are prefixed
with the question mark ("?") symbol. A triple pattern can
be evaluated according to the SPARQL semantics to find all
those mappings between variables and resources from the
graph which would make of that tuple a valid triple. For a
more comprehensive description of RDF and SPARQL we
refer the reader to [
        <xref ref-type="bibr" rid="ref13 ref25">13, 25</xref>
        ].
      </p>
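The evaluation of a triple pattern described above can be sketched in a few lines. The following is a minimal illustration over a toy graph; all identifiers are made up for this example, and a real system would delegate this to a SPARQL engine:

```python
# Minimal sketch of triple-pattern matching over an RDF graph,
# modeled as a set of (subject, predicate, object) tuples.
# Identifiers and data are illustrative, not from the paper's dataset.

def is_var(term):
    """A term is a variable if it is prefixed with '?'."""
    return isinstance(term, str) and term.startswith("?")

def evaluate(pattern, graph):
    """Return all variable->resource mappings that turn
    `pattern` into a triple contained in `graph`."""
    mappings = []
    for triple in graph:
        binding = {}
        for pat_term, term in zip(pattern, triple):
            if is_var(pat_term):
                # A repeated variable must bind consistently.
                if binding.get(pat_term, term) != term:
                    binding = None
                    break
                binding[pat_term] = term
            elif pat_term != term:
                binding = None
                break
        if binding is not None:
            mappings.append(binding)
    return mappings

# Toy graph: a guitar connected to an attribute and a category.
G = {
    ("ex:guitar", "ex:attribute", "dbpedia:Bronze"),
    ("ex:guitar", "ex:category", "dbpedia:Acoustic_guitars"),
    ("ex:tuner", "ex:category", "dbpedia:Instrument_accessories"),
}

# Evaluate the triple pattern (?x, ex:category, ?y).
for m in evaluate(("?x", "ex:category", "?y"), G):
    print(m["?x"], "->", m["?y"])
```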
      <p>A path in graph G between a source resource rs and a
target resource ro is denoted by (rs, ro)(p1, p2, ..., pt), where
(p1, p2, ..., pt) is the sequence of predicates that connects rs
to ro2. A path can be recursively defined as follows:
(rs, ro)(p) is a path of length 1 if there exists a triple
(rs, p, ro) ∈ G which connects rs to ro.
(rs, ro)(p1, p2, ..., pt) is a path of length t if there exists
a resource rl such that (rs, rl)(p1, p2, ..., pt−1) is a path
of length t−1 and (rl, ro)(pt) is a path of length 1.
In our RDF graph each of the above-mentioned elements,
products, attributes and categories, corresponds to an RDF
resource. Let P = {rp1, rp2, ..., rpn} ⊆ I be a subset of all
IRIs representing a set of products, A = {ra1, ra2, ..., ram} ⊆ I
a set of products' attributes and C = {rc1, rc2, ..., rcl} ⊆ I
a set of products' categories. From now on products,
attributes and categories will denote the RDF resources which
identify these elements. Moreover, we will use rp, ra and
rc to denote an arbitrary product, attribute and category,
respectively. We assume that each product is connected to
its attributes and to the categories to which it belongs in
G. Moreover, there exist arbitrary connections in G among
attributes and categories, i.e. there are no restrictions on how
resources are connected.</p>
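The recursive path definition above amounts to a bounded search for predicate sequences. A minimal sketch over toy triples (identifiers are made up; triple direction is followed here for brevity, whereas later stages of the framework ignore edge direction):

```python
# Sketch: enumerate all predicate sequences (p1, ..., pt) up to a
# maximum length that connect `source` to `target`, following the
# recursive path definition in the text. Toy data, illustrative IRIs.

def find_paths(graph, source, target, max_len):
    """Return predicate sequences of length <= max_len connecting
    source to target via intermediate resources."""
    paths = []
    frontier = [(source, ())]  # (current resource, predicates so far)
    while frontier:
        node, preds = frontier.pop()
        if len(preds) == max_len:
            continue
        for s, p, o in graph:
            if s == node:
                if o == target:
                    paths.append(preds + (p,))
                frontier.append((o, preds + (p,)))
    return paths

G = [
    ("ex:tv", "ex:connectivity", "dbpedia:Bluetooth"),
    ("ex:speaker", "ex:connectivity", "dbpedia:Bluetooth"),
    ("dbpedia:Bluetooth", "ex:type", "dbpedia:Wireless"),
]

# Directed paths of length <= 2 from the TV to the Wireless resource.
print(find_paths(G, "ex:tv", "dbpedia:Wireless", max_len=2))
```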
      <p>Problem statement. Given two products (rpi, rpj) in G,
we would like to predict whether rpj is complementary to
rpi using G. Let comp(rpi, rpj) be a function that returns 1
when this relationship holds, 0 otherwise. It is directional,
i.e. comp(rpi, rpj) = 1 doesn't imply that comp(rpj, rpi)
also returns 1 (see footnote 3).</p>
    </sec>
    <sec id="sec-3">
      <title>3. OUR APPROACH</title>
      <p>2 For simplicity, we assume (p1, p2, ..., pt) identifies a unique
sequence of predicates which connects rs to ro.
3 For instance, a guitar tuner is complementary to a guitar,
but the opposite does not necessarily hold true.
As previously mentioned, attributes are extracted from the
description of a product. Moreover, a category is a
structured piece of information organized as a taxonomy. The
categories go from generic (Electronics) to more specific
(Mice). However, CompLoD doesn't make use of the
taxonomy and sees all categories as independent.</p>
      <p>Together all of this information is then processed in a
pipeline-like fashion. First the product's meta-data is
transformed into a structured graph in which product attributes
and categories are matched against DBpedia resources (A).
The resulting knowledge graph is a subgraph of DBpedia,
extended with the products represented as synthetic nodes.
Being aware that not all attributes have the same role in
characterizing a product and that not all the
interconnections are able to provide the same quality of information,
we decided to attach scores to the nodes and paths in the
graph using consolidated metrics (B). The results are used in
the next stage to build the feature set for pairs of products.
Finally, we train a classification model (C) and when this is
fitted, the framework is able to answer the question whether
one product is complementary to another one. More details
about the individual tasks are provided in the following
sections.</p>
    </sec>
    <sec id="sec-4">
      <title>3.1 Building a knowledge graph</title>
      <p>
        Our framework requires a knowledge graph as the starting
point and we opted for DBpedia. In addition to the
deterministic rules which are used to build the graph, DBpedia
also has regularities with predictive power [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. To make use
of this, we need to integrate the products' meta-data into
it by extracting structured information from the text. We
therefore use AlchemyAPI, a NER tool. This identi es those
tokens which correspond to published LoD IRIs. Whereas
the tool supports the mapping to resources from several LoD
datasets4, we only store the information relative to DBpedia
and leave the possibility of exploiting multiple datasets to
future work. For instance, from the previous text some of
the identified resources are5:
      </p>
      <p>Categories</p>
      <p>Electronics: dbpedia:Electronics</p>
      <p>Computer &amp; Accessories: dbpedia:Computer, ...</p>
      <p>Attributes</p>
      <p>Logitech: dbpedia:Logitech, yago:Logitech, ...</p>
      <p>Operating system: dbpedia:Operating system, ...</p>
      <p>Microsoft Windows: dbpedia:Microsoft Windows
Mac OS X: dbpedia:Mac OS X, ...</p>
      <p>Microsoft: dbpedia:Microsoft, yago:Microsoft, ...</p>
      <p>IMac: dbpedia:IMac, yago:IMac, ...</p>
      <p>Windows Vista: dbpedia:Windows Vista, ...</p>
      <p>
        Example 2: Resources identified from the
meta-data of the product shown in example 1.
It can be observed that some of the identified resources
correspond to categories, such as dbpedia:Electronics, while
others are attributes, such as dbpedia:Logitech (brand). To keep
the approach fully automated, we treat categories as text as
well. The quality of this task highly depends on the
efficacy of the NER tool, each of which has its own limitations.
While most of the important resources are identified, some
of them, like dbpedia:Computer mouse (category), which are
important to characterize the product, are not necessarily
recognized. For more details about the expected
performance of these tools we refer the reader to [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>Given the products and the identified resources, the RDF
graph is built as follows:
1. First we build a basic RDF graph Gp. We create one
resource rp for each product and connect it to its
attribute IRIs. Let Arp be the bag6 of attributes
identified in the input text of the product represented by
rp. We create a triple (rp, sp, ra) linking the product
rp with each ra ∈ Arp using a synthetic predicate sp.
If an attribute ra appears more than once in Arp, we
create as many triples as the number of times ra
appears in Arp, where each triple has a different synthetic
predicate.
2. Let Crp be the set of categories to which the product
belongs. We also create triples (rp, sp, rc) that link
products with their categories.
3. We enrich Gp by extracting a subgraph from DBpedia.
4 Currently supported at the date of publication are DBpedia,
Freebase, US Census, GeoNames, UMBEL, OpenCyc,
YAGO, MusicBrainz, CIA Factbook and CrunchBase.
5 We use prefixes to shorten the namespaces:
dbpedia: http://dbpedia.org/resource/
yago: http://yago-knowledge.org/resource/
6 The tokens can appear multiple times in the text. We have
to preserve the information of how often the attribute occurs
in order to compute metrics in the next stage.</p>
      <p>This subgraph contains all DBpedia resources
identified by the NER tool and all the resources reachable
within 2 hops from them. We ignore the direction of
the edges to do so. The reason why we limit the length
to 2 is to reduce the computational cost of computing
metrics on top of the graph (more details in the next
section). We merge Gp with the extracted subgraph
to form the Extended graph EGp. Note that attributes
and categories in EGp are interconnected.</p>
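The construction of Gp and the 2-hop extraction can be sketched as follows, with a plain list of triples standing in for the DBpedia dump; all identifiers and helper names are illustrative, not from the actual implementation:

```python
# Sketch of the graph-building step: synthetic product triples (Gp)
# plus the DBpedia triples within 2 undirected hops of the
# identified resources. All identifiers are illustrative.
from collections import deque

def build_gp(products):
    """products: dict product -> list of identified attribute/category
    IRIs. Repeated resources get distinct synthetic predicates."""
    gp = set()
    for rp, resources in products.items():
        for i, r in enumerate(resources):
            gp.add((rp, f"sp{i}", r))
    return gp

def two_hop_subgraph(dbpedia, seeds):
    """Keep every DBpedia triple whose endpoints lie within 2 hops
    of a seed, ignoring edge direction as the text prescribes."""
    dist = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if dist[node] == 2:
            continue
        for s, p, o in dbpedia:
            for nxt in ((o,) if s == node else (s,) if o == node else ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
    return {(s, p, o) for s, p, o in dbpedia if s in dist and o in dist}

products = {"ex:mouse": ["dbpedia:Logitech", "dbpedia:Electronics"]}
dbpedia = [
    ("dbpedia:Logitech", "dbo:industry", "dbpedia:Computer_hardware"),
    ("dbpedia:Computer_hardware", "dct:subject", "dbpedia:Electronics"),
    ("dbpedia:Electronics", "dct:subject", "dbpedia:Technology"),
    ("dbpedia:Technology", "dct:subject", "dbpedia:Society"),
]
# The extended graph EGp merges Gp with the extracted subgraph.
egp = build_gp(products) | two_hop_subgraph(dbpedia, set(products["ex:mouse"]))
```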
      <p>EGp is then the input of the next task, in which metrics
are computed to measure the importance of an attribute or
a category for a product, as well as the importance of the
semantic paths interlinking them.</p>
    </sec>
    <sec id="sec-5">
      <title>3.2 Computation of metrics</title>
      <p>In our knowledge graph every product is connected to its
attributes and categories. However, those attributes might
have a distinct relevance in describing a product or even in
describing a category (seen as a set of products). Therefore,
we require some metrics that reflect the following intuition:
How well can an attribute characterize a product? E.g.,
is equalizer more representative than rechargeable to
describe a speaker? (Section 3.2.1)</p>
      <p>How well can an attribute characterize a category? E.g.,
is 4K more representative than wireless for the
category Television &amp; Video? (Section 3.2.2)</p>
      <p>How relevant is the information carried by a path
connecting two resources? Let's assume, for instance, that
two product resources, a television (tv) and a speaker
(s), are connected through several paths of different
lengths in EGp.</p>
      <p>Is the path tv -connectivity- Bluetooth -connectivity- s that
goes through Bluetooth more important than another
one that goes through Stereo? Which of these paths
is the most informative? Does the length also play a
role? Suppose that the connection is now
tv -connectivity- Bluetooth -type- Wireless -connectivity- s. Does
the informativeness decay with the length of the path?
(Section 3.2.3)</p>
      <p>
        In order to provide answers to those questions and ful ll the
requirements we employ Term frequency-inverse document
frequency (TF-IDF)[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], a weighting scheme widely used in
Information Retrieval and text mining to determine how
important a term is to a document within a corpus. The design
of such metrics is achieved through different instantiations
of TF-IDF, i.e. we map different components of the RDF
graph to terms and documents. This requires the
consideration of a product rp as a bag of attributes {ra1, ra2, ..., rak}.
A category rc can also be thought of as a bag of attributes,
namely those used to describe products of that category.
We will explain these metrics in more detail in the following
sections.
3.2.1 (Product) Attribute Frequency - Inverse
Product Frequency (PAF-IPF)
      </p>
      <p>The representation of a product as a bag of attributes
allows us to define a weighting scheme that reflects how well
an attribute ra is able to represent or describe a product rp.
We instantiate TF-IDF using a product as a document and
its attributes as the terms within the document.</p>
      <p>PAF-IPF(ra, rp) = PAF(ra, rp) · IPF(ra)</p>
      <p>IPF(ra) = log( |P| / PF(ra) )</p>
      <p>Intuitively, PAF(ra, rp) counts the number of times a product
rp is described by an attribute ra, i.e. it counts the number
of triples that link the product to the attribute. In a similar
way PF(ra) counts the number of products that have ra as
an attribute. |P| is the cardinality of the set P, i.e. the
number of products available in G. The goal of IPF(ra) is
to reduce the score by a factor which is proportional to the
number of times the attribute is used to describe products.
If an attribute is common to all products then it cannot
strongly characterize the product.
3.2.2 (Category) Attribute Frequency - Inverse
Category Frequency (CAF-ICF)
In the same way we define the following:</p>
      <p>CAF-ICF(ra, rc) = CAF(ra, rc) · ICF(ra)</p>
      <p>ICF(ra) = log( |C| / CF(ra) )</p>
      <p>which reflects to which extent an attribute characterizes a
category. CAF(ra, rc) counts the number of times that an
attribute resource ra is used to describe a product rp that
belongs to a category rc. CF(ra) counts the number of
categories in which the attribute ra is used to describe at least
one product which belongs to that category. |C| is the
cardinality of the set C. The role of ICF(ra) is the same as
that of IPF(ra).</p>
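Both weighting schemes reduce to simple counts over the bags of attributes. A sketch with toy data and hypothetical helper names, not the paper's implementation:

```python
# Sketch of PAF-IPF and CAF-ICF: a product (or a category) is a
# document and its attributes are the terms. Toy data only; real
# counts would come from triples in the EGp graph.
import math

products = {                      # product -> bag of attributes
    "speaker": ["equalizer", "rechargeable", "rechargeable"],
    "headset": ["rechargeable", "microphone"],
}
category_of = {"speaker": "Audio", "headset": "Wearables"}

def paf_ipf(ra, rp):
    paf = products[rp].count(ra)                       # PAF(ra, rp)
    pf = sum(1 for a in products.values() if ra in a)  # PF(ra)
    return paf * math.log(len(products) / pf)          # PAF * IPF

def caf_icf(ra, rc):
    cats = set(category_of.values())
    # CAF: occurrences of ra across products of category rc
    caf = sum(a.count(ra) for rp, a in products.items()
              if category_of[rp] == rc)
    # CF: number of categories with at least one product using ra
    cf = sum(1 for c in cats
             if any(ra in a and category_of[rp] == c
                    for rp, a in products.items()))
    return caf * math.log(len(cats) / cf)              # CAF * ICF

# 'rechargeable' describes every product and appears in every
# category, so both scores collapse to log(1) = 0: it cannot
# strongly characterize the speaker or the Audio category.
print(paf_ipf("rechargeable", "speaker"))  # 0.0
```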
      <p>3.2.3 Path Informativeness</p>
      <p>
        In addition to the two metrics introduced above, we
require another one which re ects the degree of
informativeness a path carries with it. We use the de nition of path
informativeness presented in [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], which we tailored for our
needs. This concept is based on a similar instantiation of
TF-IDF. RDF resources r are used as documents. However,
two kinds of documents are considered: those in which r
appears as the subject of a triple and those in which it has
the role of an object. This allows us to define the following
metric, Predicate Frequency - Inverse Triple Frequency (PF-ITF):
      </p>
      <p>I-PF-ITF(p, r) = I-PF(p, r) · ITF(p)</p>
      <p>O-PF-ITF(p, r) = O-PF(p, r) · ITF(p)</p>
      <p>ITF(p) = log( |T| / |T(p)| )</p>
      <p>I-PF-ITF(p, r) is the Input PF-ITF of a term p within the
document resource r and O-PF-ITF(p, r) is the Output
PF-ITF of a term p within the document resource r. I-PF(p, r)
counts the number of resources that can be mapped to the
variable ?x of the triple pattern (?x, p, r). This is also the
case for O-PF(p, r), in which the possible mappings for ?y
in (r, p, ?y) are counted. |T| is the overall number of triples
whereas |T(p)| is the number of triples in which p appears.
The role of ITF(p) is the same as for the previous
instantiations.</p>
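These counts translate directly into code. A sketch over a toy triple set, with illustrative predicate names:

```python
# Sketch of PF-ITF: predicates are the terms; each resource yields
# two documents, one for its subject role (O-PF) and one for its
# object role (I-PF). Toy triples with made-up names.
import math

T = [
    ("tv", "connectivity", "Bluetooth"),
    ("speaker", "connectivity", "Bluetooth"),
    ("Bluetooth", "type", "Wireless"),
]

def i_pf(p, r):
    # number of mappings for ?x in the triple pattern (?x, p, r)
    return sum(1 for s, q, o in T if q == p and o == r)

def o_pf(p, r):
    # number of mappings for ?y in the triple pattern (r, p, ?y)
    return sum(1 for s, q, o in T if q == p and s == r)

def itf(p):
    return math.log(len(T) / sum(1 for _, q, _ in T if q == p))

def i_pf_itf(p, r):
    return i_pf(p, r) * itf(p)

def o_pf_itf(p, r):
    return o_pf(p, r) * itf(p)

# Two triples point at Bluetooth via 'connectivity'.
print(i_pf("connectivity", "Bluetooth"))
```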
      <p>This metric allows us to define the informativeness of a path.
For a path (ri, rj)(p) of length 1 the informativeness I is
defined as:</p>
      <p>I(ri, rj)(p) = ( O-PF-ITF(p, ri) + I-PF-ITF(p, rj) ) / 2</p>
      <p>For paths of length t the informativeness is defined as the
sum of the informativeness of its components divided by the
length of the path. For a path (ri, rj)(p1, ..., pt−1, pt):</p>
      <p>I(ri, rj)(p1, ..., pt−1, pt) = ( I(ri, rk)(p1) + ... + I(rq, rs)(pt−1) + I(rs, rj)(pt) ) / t</p>
      <p>Given the previous definitions, it is possible to define Imax,
i.e. the maximum informativeness carried by a path
connecting two resources ri and rj.</p>
      <p>Imax(ri, rj) = max{ I(ri, rj)π1, ..., I(ri, rj)πk },
where πi is an arbitrary path. Each pair of paths πi and πj
is different, i.e. there exists at least one predicate in which the
two paths differ. Similarly, it is possible to define Iavg,
the average informativeness of all paths between ri and rj.</p>
      <p>Iavg(ri, rj) = ( I(ri, rj)π1 + ... + I(ri, rj)πk ) / k
To summarize: the metrics allow us to determine the
importance of attributes in describing a product or a
category, or to measure the amount of informativeness carried by
the semantic paths connecting two resources. These are
designed considering different granularities for the definition of
document. In the case of PAF-IPF a document is a product
and the terms are the attributes that describe the product.
In the second case, CAF-ICF, a document is a category and
the terms are the attributes used to describe products of
that category. In the case of informativeness, documents are
subject or object resources whereas predicates are the terms.
CompLoD computes the relevance metrics in EGp by
iterating over the set of identified attributes and then it proceeds
by searching for paths linking them.</p>
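Given per-edge PF-ITF scores, the path informativeness and the Imax/Iavg aggregates reduce to a few lines. A sketch in which the per-hop scores are made-up numbers rather than values computed from a real graph:

```python
# Sketch of path informativeness: a path is represented as a list of
# (o_pf_itf, i_pf_itf) score pairs, one per predicate hop. The
# per-hop scores below are made-up numbers for illustration.

def informativeness(path):
    """Mean of per-hop informativeness; each hop contributes
    (O-PF-ITF at its source + I-PF-ITF at its target) / 2."""
    hops = [(o + i) / 2 for o, i in path]
    return sum(hops) / len(hops)

def i_max(paths):
    """Maximum informativeness over all paths between two resources."""
    return max(informativeness(p) for p in paths)

def i_avg(paths):
    """Average informativeness over all paths between two resources."""
    scores = [informativeness(p) for p in paths]
    return sum(scores) / len(scores)

# Two hypothetical paths between a TV and a speaker: a direct hop
# through Bluetooth, and a longer detour whose score is diluted.
via_bluetooth = [(0.8, 0.6)]             # length 1
via_wireless = [(0.8, 0.4), (0.3, 0.5)]  # length 2
print(round(i_max([via_bluetooth, via_wireless]), 3))
```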
      <p>All these metrics have been validated by other works and
have been used to solve a variety of problems. However,
when one focuses again on the problem illustrated in
figure 1, i.e. the problem of designing a feature set that
applies to pairs of products, one can notice that these metrics
cannot be used directly as features. For instance, PAF-IPF
and CAF-ICF are metrics which apply to a single product.
While nothing prevents one from using them as features for
pairs, it is not possible to have each possible attribute as a
feature. This would lead to an extremely high-dimensional
feature vector. And whereas the path informativeness
between two products' attributes could be thought of as a
feature for pairs of products, one cannot have each possible
path as a feature either, for the same reason explained above
(high-dimensionality).</p>
      <p>We propose therefore a strategy to combine these metrics,
following a very natural intuition which turned out to
produce useful results. This allows one to design a
low-dimensional feature vector which can be used to model pairs
of products.</p>
    </sec>
    <sec id="sec-6">
      <title>3.3 Feature engineering</title>
      <p>
        To the best of our knowledge there is no work which
addresses the issue of which features for pairs of products
(rpi, rpj) can be used to predict the complementary
relationship. We aim at filling this gap. Our rationale is based on
the observation that for products belonging to different
domains there might exist correspondences and dependencies
between their attributes. This has been validated in
marketing, behavioral, and data mining research studies [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Our
hypothesis is that (1) the correspondences and dependencies
also tell us something about the complementary
relationship, and (2) this correspondence can be measured by the
informativeness carried in the interconnections between
attributes, weighted by those attributes' relative importance.
We therefore explain in this section how to leverage the
interconnections of the knowledge graph and combine the
metric scores in a meaningful way.
      </p>
      <p>Let Arpi and Arpj be the bags of attributes of products
rpi and rpj. To simplify the explanation of the formulas to
calculate the features we will from now on assume that rai
and raj are attributes of rpi and rpj, respectively.
Moreover, we also assume that rci and rcj are the most specific
categories of rpi and rpj. For the sake of readability we
introduce two weighting functions ωP and ωC, shown in figure 3,
which use PAF-IPF and CAF-ICF, respectively. In this
way, we can weigh a path between two attributes rai and
raj according to their characterization relevance. Next, we
will present the design choices we made.</p>
      <p>Maximal Informativeness over the most relevant
product attributes (MIMPA): this feature combines PAF-IPF
and the concept of informativeness as follows. Given a
product rpi, we first search for the attribute rai that has the
maximum PAF-IPF(rai, rpi) among its attributes. We do
the same for rpj. Let raiMP and rajMP be these attributes
for rpi and rpj, respectively. These are the most relevant
product attributes. The feature is then designed as follows:
MIMPA(rpi, rpj) = Imax(raiMP, rajMP)
Recall that when computing Imax(raiMP, rajMP) all possible
paths which connect raiMP to rajMP are considered and, from
these, it returns the maximum informativeness value.</p>
      <p>Maximal Informativeness over the most relevant
category attributes (MIMCA): we focus again on the
attribute bags of both products. However, rather than
searching for those which characterize the products best, we
look at the most specific category of each product and
leverage CAF-ICF to find the attribute which characterizes
this category best. Let raiMC and rajMC be these attributes
for rpi and rpj, respectively. Then:
MIMCA(rpi, rpj) = Imax(raiMC, rajMC) · ωC(raiMC, rajMC)
where rci and rcj are the categories of products rpi and rpj,
respectively.</p>
      <p>Average Informativeness over the most relevant
product attributes (AIMPA): this feature works in a similar
way as MIMPA. Once the most relevant attributes of the
products, raiMP and rajMP, are identified, all paths connecting
these resources are considered. Therefore, the average
informativeness Iavg is used and not only the score of the
most informative path.</p>
      <p>AIMPA(rpi, rpj) = Iavg(raiMP, rajMP)
Average Informativeness over the most relevant
category attributes (AIMCA): in the same way as MIMCA,
we consider here the attributes which characterize the
category the most, but we use Iavg instead of Imax, i.e. we
consider the average informativeness of all paths connecting
them.</p>
      <p>AIMCA(rpi, rpj) = Iavg(raiMC, rajMC)
Average of maximal informativeness over all attributes
(AMIO): rather than considering the most relevant
attribute for each product, we focus here on the
informativeness flowing from one product (from all of its attributes) to
the other one. To do so we consider all pairs of attributes rai
and raj, which belong to rpi and rpj respectively, and
compute the maximum informativeness carried by the links of
each of these pairs. As for previous features, we also combine
the relevance of individual attributes with the
informativeness of the link.</p>
      <p>Let P(rpi, rpj) = {(rai1, raj1), (rai1, raj2), ..., (raik, rajl)} be
the set of attribute tuples, where the first element is an
attribute of rpi and the second of rpj. Then:</p>
      <p>AMIO_P(rpi, rpj) = (1 / |P(rpi, rpj)|) · Σ Imax(rai, raj) · ωP(rai, raj),
with the sum ranging over all (rai, raj) ∈ P(rpi, rpj).
The weighting functions shown in figure 3 are:
ωP(rai, raj) = ( PAF-IPF(rai, rpi) + PAF-IPF(raj, rpj) ) / 2
ωC(rai, raj) = ( CAF-ICF(rai, rci) + CAF-ICF(raj, rcj) ) / 2</p>
Average of average informativeness over all attributes
(AAIO): this feature is computed in the same way as AMIO,
except that Iavg is used instead of Imax. In this way we
consider the complete ow of informativeness between each pair
of attributes.</p>
      <p>Maximal relevance of common product attributes
(MRCPA): considering only the common attributes
of both products, Arpi ∩ Arpj, we compute the PAF-IPF of
the elements of the intersection and return the maximum
value.</p>
      <p>MRCPA(rpi, rpj) = max{ ωP(rai, raj) : rai = raj ∈ (Arpi ∩ Arpj) }</p>
      <p>Maximal relevance of common category attributes
(MRCCA): this is the symmetrical counterpart of
MRCPA in which one considers the contribution of common
attributes in characterizing the product categories.</p>
      <p>MRCCA(rpi, rpj) = max{ ωC(rai, raj) : rai = raj ∈ (Arpi ∩ Arpj) }</p>
      <p>Main product relevance (MPR): this metric measures
the relevance of the main product (first element of the pair
in the input sample) on the basis of its attributes. Let
|(ra, ?p, ?o)| be the number of triples having ra as subject
and |(?s, ?p, ra)| be the number of triples having ra as object.
Then:</p>
      <p>MPR(rpi) = Σ over ra ∈ Arpi of ( |(ra, ?p, ?o)| + |(?s, ?p, ra)| )</p>
      <p>Related product relevance (RPR): this metric is the
same as the previous one, but applied to the related product
(second element of the pair in the input sample).</p>
      <p>Interconnection between products (IP): this feature
aims at reflecting the degree of connectivity between the
attributes of two products. Let φ be the number of attribute
pairs (rai, raj) ∈ P(rpi, rpj) for which there exists at least one
path connecting them. Then:</p>
      <p>IP(rpi, rpj) = φ / |P(rpi, rpj)|</p>
      <p>Product similarity (PS): to measure the degree of
similarity, we apply the Jaccard similarity to the sets of attributes
of both products:</p>
      <p>PS(rpi, rpj) = |Arpi ∩ Arpj| / |Arpi ∪ Arpj|</p>
Other metrics: We designed and tried several other
features whose contribution was small or minimal.</p>
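To make two of the pairwise features concrete, here is a sketch of IP and PS for a toy pair of products; the attribute sets are made up, and `connected` is a stand-in for a real path search over the knowledge graph:

```python
# Sketch of two pairwise features: IP (fraction of attribute pairs
# connected by at least one path) and PS (Jaccard similarity of the
# attribute sets). Toy attribute sets; `connected` is a hypothetical
# oracle replacing an actual path search over EGp.
from itertools import product as cross

A_i = {"Logitech", "Wireless", "USB"}
A_j = {"Windows", "USB"}

connected = {("Wireless", "Windows"), ("USB", "USB")}  # made up

def ip_feature(a_i, a_j, connected):
    pairs = list(cross(a_i, a_j))
    linked = sum(1 for pair in pairs if pair in connected)
    return linked / len(pairs)

def ps_feature(a_i, a_j):
    return len(a_i & a_j) / len(a_i | a_j)

print(ip_feature(A_i, A_j, connected))  # 2 of the 6 pairs are linked
print(ps_feature(A_i, A_j))             # |{USB}| over a union of 4
```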
      <p>AMIO combines the characterization power of the attributes
in both roles, as product attributes and as category attributes.
AMIO_P averages, over all attribute pairs, the informativeness
weighted by the product-attribute relevance:</p>
      <p>AMIO_P(rpi, rpj) = (1 / |P(rpi,rpj)|) · Σ_{(rai,raj) ∈ P(rpi,rpj)} Imax(rai, raj) · ωP(rai, raj)</p>
      <p>AMIO_C is defined analogously with ωC, and the two are averaged:</p>
      <p>AMIO(rpi, rpj) = (AMIO_P(rpi, rpj) + AMIO_C(rpi, rpj)) / 2</p>
      <p>In this section we introduced the features which played an
important role in predicting whether one product is
complementary to another one. All 12 features are the result
of a feature selection task performed to boost the prediction
accuracy of the classifiers. They are shown on top of the
next page together with their relative importance7.</p>
    </sec>
    <sec id="sec-7">
      <title>Classification model</title>
      <p>With enough input samples and a well-defined feature set,
the last missing step consists of fitting a classifier. A wide
range of classification techniques exists for supervised
learning. We compared the performance of different kinds of
classifiers, both ensemble and non-ensemble, and random forest
proved to be the best-performing model. Moreover, we validated
our model and carried out hyper-parameter optimization. More
details can be found in the next section.</p>
    </sec>
    <sec id="sec-8">
      <title>EXPERIMENTS</title>
      <p>We conducted experiments to assess the performance of
CompLoD. Some of these experiments helped us to choose
the most suitable classifier and to optimize it. But the
central question, whether the designed feature set is really
significant, could also be answered. Each of the single tasks
of CompLoD was run on a single machine with an
Intel(R) Xeon(R) CPU E3-1231@3.80GHz processor with 4 cores and
32 GB of RAM.</p>
      <p>
        Dataset. We used a dataset from Amazon.com8. It
contains products' meta-data for each category. Our
experiments focus on the category Electronics, which includes a
wide range of subcategories, such as Internal Hard Drives,
eBook Readers, or Camera Batteries. The overall
number of subcategories considered is 817, and the number of
products is ca. 0.5 million. In addition to the meta-data for
each product, the dataset also contains four lists of related
products: also bought, also viewed, bought together, and buy after
viewing. As in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] we assume that also bought and bought
together products are complementary. Products from the
remaining lists are substitutable products9. We use them
as the negative class (non-complementary). We manually
verified that this assumption is not always correct: for some
products, some related products appeared in both groups.
      </p>
      <p>Assuming that a product cannot be both substitutable and
complementary to another one, we removed those ambiguous
related products. As a result of this process, 11% of the
complementary products and 8% of the non-complementary products
were removed from the dataset.</p>
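The de-duplication step above amounts to removing the intersection of the two label sets, as in this sketch (the function name and pair representation are our own illustration):

```python
def remove_ambiguous(complementary, substitutable):
    """Drop product pairs labeled both complementary and substitutable,
    keeping the two ground-truth classes disjoint."""
    ambiguous = complementary & substitutable
    return complementary - ambiguous, substitutable - ambiguous
```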
      <p>Building the graph. One of the initial stages of the
framework consists of identifying in the text those tokens which
correspond to resources published in the LoD cloud.
Approx. 3500 resources were recognized by the NER tool,
which is a relatively low number. However, for 96% of the
products at least one resource could be identified. The
unmatched 4% were removed from the dataset. By querying
DBpedia's remote SPARQL endpoint we extract the triples
in which those resources appear as subject or object of a
triple. We repeat this process iteratively until we obtain
all resources reachable within 2 hops of the identified
resources. The resulting graph has approx. 1.4 million
resources.
7The importance is computed as the normalized total
reduction of the criterion represented by that feature (Gini
importance of random forests).
8The dataset is part of the "Stanford Network
Analysis Project (SNAP)" and is made available for research.
9We are not addressing in this work the task of predicting
the substitutable relationship.</p>
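The iterative 2-hop expansion can be sketched as a breadth-first traversal over triples. This toy version assumes the triples have already been fetched into memory; the actual framework issues repeated SPARQL queries to the remote endpoint instead:

```python
def expand_neighborhood(seeds, triples, hops=2):
    """Collect all resources reachable within `hops` hops of the seed
    resources, following triples in both directions (subject <-> object)."""
    reached = set(seeds)
    frontier = set(seeds)
    for _ in range(hops):
        nxt = set()
        for s, _p, o in triples:
            if s in frontier:
                nxt.add(o)
            if o in frontier:
                nxt.add(s)
        # only freshly discovered resources seed the next hop
        frontier = nxt - reached
        reached |= frontier
    return reached
```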
      <p>Computing metrics. Once the knowledge graph is built,
we compute the metrics explained in section 3.2. The
PAF-IPF and CAF-ICF relevance scores and the informativeness
of the paths are efficiently stored in the system using
inverted indexes for easy retrieval.</p>
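Such an index can be as simple as a two-level mapping from attribute to metric name to precomputed score, so that feature computation later looks values up in constant time. A minimal sketch (class and metric names are our own, not the paper's implementation):

```python
from collections import defaultdict

class ScoreIndex:
    """Minimal inverted index: attribute -> metric name -> precomputed
    relevance score, enabling O(1) lookups during feature computation."""

    def __init__(self):
        self._idx = defaultdict(dict)

    def put(self, attribute, metric, value):
        self._idx[attribute][metric] = value

    def get(self, attribute, metric, default=0.0):
        return self._idx[attribute].get(metric, default)
```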
      <p>Classification. With the metrics computed, we compute
the features as explained in section 3.3 for those pairs of
products which appear in the ground truth. We store the
feature vectors and attach the class label to each,
complementary or non-complementary. The overall number of
input samples is ca. 3.7 million, of which 1.4 million
represent complementary pairs and 2.3 million the
non-complementary counterpart. These input samples were then
split into training and test sets. We kept the distribution
at 50% positive and 50% negative samples. We started
with a relatively small number of samples for the training
set (10%) and gradually increased it until we reached
90%. In this way we were able to analyze the learning rate
of the classifiers. We compared the performance of several of
them, namely Decision Trees, Random Forest, AdaBoost,
Naive Bayes (Gaussian), Linear Discriminant Analysis and
Quadratic Discriminant Analysis. We used the following
rates computed from the confusion matrix (TP = true
positive, FP = false positive, TN = true negative, FN = false
negative):</p>
      <p>Precision(+) = TP / (TP + FP),  Recall(+) = TP / (TP + FN)</p>
      <p>Precision(-) = TN / (TN + FN),  Recall(-) = TN / (TN + FP)</p>
      <p>Accuracy = (TP + TN) / (TP + TN + FP + FN)</p>
      <p>F1-Measure(+) = 2 · Precision(+) · Recall(+) / (Precision(+) + Recall(+))</p>
      <p>The label (+) in a metric indicates the quality of
predicting the positive class, whereas (-) refers to the
negative one. Accuracy measures the fraction of correctly
classified pairs over both classes. The performance of
the different classifiers was similar. To avoid overfitting
and make sure that the model generalizes to an
independent dataset, we validated our model with K-Fold and Monte
Carlo cross-validation (CV). While K-Fold CV makes sure
that each of the folds is used at least once as a test set,
Monte Carlo CV randomly shuffles the whole dataset, thus
generating independent partitions for each run. The final
results are shown in Figure 4(A), which illustrates that the
Random Forest classifier improves its accuracy with more
input samples from which it can learn, reaching
up to ca. 78.5% accuracy. Panel (B) of the same figure shows that
the recall for the negative class overtakes that of the
positive class. It is important to mention that a TN refers to a
non-complementary ground-truth pair which is predicted to
be non-complementary. As the very last task we performed
hyper-parameter optimization by exploiting two techniques,
namely Random Search (run with 30 iterations) and Grid
Search (run with the top-3 parameter set).</p>
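The evaluation rates above follow directly from the four confusion-matrix entries, as in this short sketch (function and key names are our own):

```python
def classification_rates(tp, fp, tn, fn):
    """Per-class precision/recall, F1 for the positive class, and
    accuracy, computed from a binary confusion matrix."""
    precision_pos = tp / (tp + fp)
    recall_pos = tp / (tp + fn)
    precision_neg = tn / (tn + fn)
    recall_neg = tn / (tn + fp)
    f1_pos = 2 * precision_pos * recall_pos / (precision_pos + recall_pos)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"P+": precision_pos, "R+": recall_pos,
            "P-": precision_neg, "R-": recall_neg,
            "F1+": f1_pos, "acc": accuracy}
```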
      <p>Results and reflections. One of the unique advantages
of our framework is that new products have a chance to be
classified as complementary to other products even in the
absence of any previous purchase history of the
product.</p>
      <p>[Figure 4: (A) positive precision, negative precision and accuracy; (B) positive and negative recall and F1-measure, as a function of training-set size.]</p>
      <p>
        [Table: Feature set for pairs of product resources (rpi, rpj) and their feature importance.]
        The results are certainly not as good as in the case
for which transaction data or post-transaction feedback is
available, but this work should open a new door for
further contributions. For instance, the work of McAuley et
al. in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], which uses post-transaction feedback to learn a
model, obtained an accuracy of 88% on the same dataset
and category using reviews and additional meta-data which
we did not consider in our work, e.g. prices10. The
performance of CompLoD is then surprisingly good, taking into
account that only minimal information about the product is
used, therefore making it well-suited to deal with the
cold-start problem.
      </p>
      <p>CompLoD also enables a different perspective. Whenever
the relationship between two products is identified as
complementary and an explanation is required, it is possible to
go back from the values of the single features to the graph
components and the values of the computed graph metrics,
and to provide the user with a reasonable explanation. For
instance, one could use the most relevant attributes or
category attributes of both products, e.g. those that maximize
PAF-IPF or CAF-ICF, respectively, and the path connecting
those attributes which maximizes the informativeness.
Figure 5 illustrates this. All these components are a valuable
source of explanation. However, assessing which of these
graph components are the most useful would require an
online study with real users.</p>
      <p>Computational cost and scalability. PAF-IPF and
CAF-ICF have a cost of O(|P|+|A|) and O(|C|+|A|), respectively.</p>
      <p>The cost of path informativeness is higher: O(|I|+|Pred|),
where Pred is the set of all vocabulary predicates. While
it is obvious that extracting information from a large graph
and computing metrics over it is certainly expensive, most
of the tasks done in the single stages of the framework can
be pre-computed off-line. For instance, one could try to keep
as many attributes in the graph as possible, or even try to
anticipate attributes of soon-to-be-released products. The
10The results differ not only in terms of the
learning source but also in the fact that they only considered in
their experiments products with 20 reviews or more, which
means that only 20% of the pairs of products contained in
the original ground truth remain after filtering those
products out.
metrics can then be computed and stored at once. In this
way, when a new product arrives in the system, all of its
attributes will already be present in the graph, together with
the informativeness scores of the semantic paths connecting
them to other attributes.</p>
    </sec>
    <sec id="sec-9">
      <title>RELATED WORK</title>
      <p>
        Approaches to find complementary products have a wide
range of applications, from product placement to product
bundling to website structure optimization. In
particular, on-line retailers make use of complementary products
to provide recommendations [
        <xref ref-type="bibr" rid="ref30 ref35">30, 35</xref>
        ], which help users
to discover related products while improving the volume of
sales. Most of these techniques typically consist of mining
transactional purchase data, event logs, or users' implicit
feedback. However, for new products for which such
information is not available or is scarce, most of these approaches
are not applicable or fail altogether.
      </p>
      <p>
        As in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], we aim at learning the semantics of this
relationship, thus making it possible to disregard transactions.
Therefore, we build a knowledge graph in which products,
attributes, and categories are represented as nodes
interconnected by semantic paths. Knowledge acquisition is in
general a very costly task. Therefore, we extract this
information from DBpedia, whose usage as a knowledge base has
been validated in other fields, such as that of Cross-domain
Recommender Systems [
        <xref ref-type="bibr" rid="ref23 ref29 ref6 ref9">6, 9, 23, 29</xref>
        ].
      </p>
      <p>
        Being able to learn from the semantics encoded in the
graph and the patterns therein has some commonalities with
Link Discovery or Link Prediction. These approaches see
a knowledge graph as a statistical model. The existence
of triples is represented in these models as binary random
variables which are correlated with each other. Different
assumptions about the correlation lead to different models, e.g.
based on latent features [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], graph features, or those in
which variables have local interactions [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. Our work is
closer to observable graph feature models, in the sense that
some of the problems addressed there, e.g. designing a
feature set from a graph to fit a learning model, are similar. For
instance, [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] focuses on predicting future friendship links in
Social Networks using topological features. Aiming at the
same goal, [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] presents and uses a combination of a large
number of features. These techniques are not necessarily
limited to Social Networks, but are also used in fields like
Biomedicine [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] or Genomics [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ].
      </p>
      </p>
      <p>However, there is a central difference between these
approaches and ours. First, we deal here with a knowledge
graph in which the structured information is integrated into
a complex ontology. This means two nodes can be
interconnected in several ways which are not necessarily known
in advance. This differs from the kinds of graphs typically
treated in the link prediction problem, e.g. social, concept,
or interaction networks, whose edges are rather intended to
represent a small number of relationships and are therefore less
flexible in their representation. Secondly, many of the
features proposed there are based on local similarity indices, such as
Common Neighbors, Adamic-Adar Index, Preferential
Attachment Index, Jaccard Index, etc., and might therefore be
too localized to capture the patterns which determine the
complementary relationship.</p>
      <p>
        A related problem can also be found in the field of online
search behavior [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. In this area some efforts have been
made to predict whether two queries posed by a user within
a single session or across multiple sessions aim at performing the same
task. To do so, some approaches try to model hierarchies
of tasks and subtasks. For example, booking a hotel and
registering for a conference venue are subtasks of planning a
conference attendance. These methods are typically based
on classification, as in our case, or clustering [
        <xref ref-type="bibr" rid="ref11 ref12 ref17 ref33">11, 12, 17,
33</xref>
        ]. Since purchasing one product and a complementary
one can be thought of as the problem of matching tasks, we
consider this line of research related work.
      </p>
    </sec>
    <sec id="sec-10">
      <title>CONCLUSION AND FUTURE WORK</title>
      <p>We presented CompLoD, a framework that leverages
different models and techniques to identify complementary
products while only using a product's available meta-data, such
as titles, descriptions and categories. By extracting
structured information from the text, we manage to build a
knowledge graph in which products' attributes are interconnected by
semantic paths. Being able to identify complementary
products from this graph, rather than using transactions as most
current techniques do, solves the cold-start problem
for new items. Moreover, the fact that this graph is based on
DBpedia demonstrates the usefulness and value of Semantic
Web standards. The prediction task in the last stage is
implemented with a Random Forest classifier which predicts,
for a pair of products, whether one product is
complementary to the other. However, such a model requires a feature
set which represents properties of pairs of products. To the
best of our knowledge, this is the first time that the
problem of extracting this information from a semantic graph to
predict the complementary relationship has been addressed. Our
experiments show that the classifier is able to learn the
relationship from our designed feature set, thus validating it.</p>
      <p>Although the predictive power of our system is
surprisingly good, given that it is achieved only by using
products' meta-data, it is important to mention that the quality
of CompLoD depends on the quality of the models and tools
used in the single stages. This dependency consequently
leaves room for improvement. Therefore, we would like to
try different NER tools and LoD datasets and investigate their
qualitative impact. We could also exploit a more
domain-specific LoD dataset. For this reason, the framework is
extensible, i.e. all components can be extended or replaced.</p>
      <p>In the future we would like to conduct further experiments
considering other categories, to strengthen the significance
of our approach. Predicting whether a product is
complementary to another one might still produce too many
recommendation candidates for a single user. Therefore, we would
like to extend our model to further refine the list
of complementary products, taking into account the user's
taste. For this task, a ranking strategy might be more
suitable than a classification one. We would also like to include
features that reflect the hierarchical structure of categories
and, in general, to continue searching for new features that
improve the accuracy of the system.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          , A. ElKorany, and
          <string-name>
            <given-names>R.</given-names>
            <surname>Bahgat</surname>
          </string-name>
          .
          <article-title>A supervised learning approach to link prediction in twitter</article-title>
          .
          <source>Social Network Analysis and Mining</source>
          ,
          <volume>6</volume>
          (
          <issue>1</issue>
          ):
          <fpage>24</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Aribarg</surname>
          </string-name>
          and
          <string-name>
            <given-names>N. Z.</given-names>
            <surname>Foutz</surname>
          </string-name>
          .
          <article-title>Category-based screening in choice of complementary products</article-title>
          .
          <source>Journal of Marketing Research</source>
          ,
          <volume>46</volume>
          (
          <issue>4</issue>
          ):
          <volume>518</volume>
–
          <fpage>530</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Azaria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hassidim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kraus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Eshkol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Weintraub</surname>
          </string-name>
          ,
          <string-name>
            <surname>and I. Netanely.</surname>
          </string-name>
          <article-title>Movie recommender system for profit maximization</article-title>
          .
          <source>In 7th ACM Conf. on Recommender Systems</source>
          , RecSys '13, Hong Kong, China, pages
          <volume>121</volume>
–
          <fpage>128</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>I.</given-names>
            <surname>Cantador</surname>
          </string-name>
          ,
          <string-name>
            <surname>I.</surname>
          </string-name>
          <article-title>Fernández-Tobías, S. Berkovsky, and</article-title>
          <string-name>
            <given-names>P.</given-names>
            <surname>Cremonesi</surname>
          </string-name>
          .
          <article-title>Cross-domain recommender systems</article-title>
          .
          <source>In Recommender Systems Handbook</source>
          , pages
          <volume>919</volume>
–
          <fpage>959</fpage>
          .
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W.</given-names>
            <surname>Cukierski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hamner</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Yang</surname>
          </string-name>
          .
          <article-title>Graph-based features for supervised link prediction</article-title>
          .
          <source>In Neural Networks (IJCNN)</source>
          ,
          <source>Int. Joint Conf. on</source>
          , pages
          <volume>1237</volume>
–
          <fpage>1244</fpage>
          ,
          <year>July 2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>I.</given-names>
            <surname>Fernández-Tobías</surname>
          </string-name>
          , I. Cantador,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kaminskas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Ricci</surname>
          </string-name>
          .
          <article-title>A generic semantic-based framework for cross-domain recommendation</article-title>
          .
          <source>In Proc. of the 2Nd Int. Workshop on Information Heterogeneity and Fusion in Recommender Systems, HetRec '11</source>
          , pages
          <fpage>25</fpage>
–
          <fpage>32</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F.</given-names>
            <surname>Gedikli</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          .
          <article-title>Neighborhood-restricted mining and weighted application of association rules for recommenders</article-title>
          .
          <source>In Proc. of the 11th Int. Conf. on Web Information Systems Engineering</source>
          , WISE'
          <volume>10</volume>
          , pages
          <fpage>157</fpage>
–
          <fpage>165</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Heuss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Humm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Henninger</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Rippl</surname>
          </string-name>
          .
          <article-title>A comparison of NER tools w.r.t. a domain-specific vocabulary</article-title>
          .
          <source>In Proc. of the 10th Int. Conf. on Semantic Systems, SEM '14</source>
          , pages
          <fpage>100</fpage>
–
          <fpage>107</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kaminskas</surname>
          </string-name>
          ,
          <string-name>
            <surname>I.</surname>
          </string-name>
          <article-title>Fernández-Tobías, I. Cantador, and</article-title>
          <string-name>
            <given-names>F.</given-names>
            <surname>Ricci</surname>
          </string-name>
          .
          <article-title>Ontology-Based Identification of Music for Places</article-title>
          , pages
          <volume>436</volume>
–
          <fpage>447</fpage>
          .
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Katukuri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. V.</given-names>
            <surname>Raghavan</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          .
          <article-title>Supervised link discovery on large-scale biomedical concept networks</article-title>
          .
          <source>In Proc. of the 2011 IEEE Int. Conf. on Bioinformatics and Biomedicine</source>
          ,
          <source>BIBM '11</source>
          , pages
          <fpage>562</fpage>
–
          <fpage>568</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Zha</surname>
          </string-name>
          .
          <article-title>Identifying and labeling search tasks via query-based Hawkes processes</article-title>
          .
          <source>In Proc. of the 20th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, KDD '14</source>
          , pages
          <fpage>731</fpage>
–
          <fpage>740</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Huang</surname>
          </string-name>
          , L. w. He, and
          <string-name>
            <given-names>Q.</given-names>
            <surname>He</surname>
          </string-name>
          .
          <article-title>Task trail: An effective segmentation of user search behavior</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          ,
          <volume>26</volume>
          (
          <issue>12</issue>
          ):
          <volume>3090</volume>
–
          <fpage>3102</fpage>
          ,
          <string-name>
            <surname>Dec</surname>
          </string-name>
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F.</given-names>
            <surname>Manola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Miller</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>McBride</surname>
          </string-name>
          .
          <article-title>RDF Primer</article-title>
          . W3C Recommendation,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>J. J. McAuley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pandey</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Leskovec</surname>
          </string-name>
          .
          <article-title>Inferring networks of substitutable and complementary products</article-title>
          .
          <source>In Proc. of the 21th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining</source>
          , Sydney, NSW, Australia, pages
          <fpage>785</fpage>
–
          <fpage>794</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>J. J. McAuley</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Targett</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          <string-name>
            <surname>Shi</surname>
          </string-name>
          , and A. van den Hengel.
          <article-title>Image-based recommendations on styles and substitutes</article-title>
          .
          <source>In Proc. of the 38th Int. ACM SIGIR Conf. on Research and Development in Information Retrieval</source>
          , Santiago, Chile,
          <year>2015</year>
          , pages
          <fpage>43</fpage>
–
          <fpage>52</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>R.</given-names>
            <surname>Mehrotra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Yilmaz</surname>
          </string-name>
          .
          <article-title>Uncovering task based behavioral heterogeneities in online search behavior</article-title>
          .
          <source>In Proc. of the 39th Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, SIGIR '16</source>
          , pages
          <fpage>1049</fpage>
          -
          <lpage>1052</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>R.</given-names>
            <surname>Mehrotra</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Yilmaz</surname>
          </string-name>
          .
          <article-title>Towards hierarchies of search tasks &amp; subtasks</article-title>
          .
          <source>In Proc. of the 24th Int. Conf. on World Wide Web, WWW '15 Companion</source>
          , pages
          <fpage>73</fpage>
          -
          <lpage>74</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R.</given-names>
            <surname>Meymandpour</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Davis</surname>
          </string-name>
          .
          <article-title>Recommendations using linked data</article-title>
          .
          <source>In Proc. of the 5th Ph.D. Workshop on Information and Knowledge Management</source>
          , PIKM 2012, Maui, HI, USA, pages
          <fpage>75</fpage>
          -
          <lpage>82</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>I. C.</given-names>
            <surname>Mogotsi</surname>
          </string-name>
          . Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze: Introduction to Information Retrieval - Cambridge University Press, Cambridge, England,
          <year>2008</year>
          , 482 pp,
          <source>ISBN: 978-0-521-86571-5</source>
          . Inf. Retr.,
          <volume>13</volume>
          (
          <issue>2</issue>
          ):
          <fpage>192</fpage>
          -
          <lpage>195</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nickel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Murphy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tresp</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Gabrilovich</surname>
          </string-name>
          .
          <article-title>A review of relational machine learning for knowledge graphs</article-title>
          .
          <source>Proc. of the IEEE</source>
          ,
          <volume>104</volume>
          (
          <issue>1</issue>
          ):
          <fpage>11</fpage>
          -
          <lpage>33</lpage>
          , Jan
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nickel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tresp</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.-P.</given-names>
            <surname>Kriegel</surname>
          </string-name>
          .
          <article-title>Factorizing YAGO: Scalable machine learning for linked data</article-title>
          .
          <source>In Proc. of the 21st Int. Conf. on World Wide Web, WWW '12</source>
          , pages
          <fpage>271</fpage>
          -
          <lpage>280</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>D.</given-names>
            <surname>Nikovski</surname>
          </string-name>
          and
          <string-name>
            <given-names>V.</given-names>
            <surname>Kulev</surname>
          </string-name>
          .
          <article-title>Induction of compact decision trees for personalized recommendation</article-title>
          .
          <source>In Proc. of the 2006 ACM Symposium on Applied Computing, SAC '06</source>
          , pages
          <fpage>575</fpage>
          -
          <lpage>581</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>V. C.</given-names>
            <surname>Ostuni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. Di</given-names>
            <surname>Noia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. Di</given-names>
            <surname>Sciascio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Oramas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>X.</given-names>
            <surname>Serra</surname>
          </string-name>
          .
          <article-title>A semantic hybrid approach for sound recommendation</article-title>
          .
          <source>In Proc. of the 24th Int. Conf. on World Wide Web, WWW '15 Companion</source>
          , pages
          <fpage>85</fpage>
          -
          <lpage>86</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>B.</given-names>
            <surname>Pathak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Garfinkel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. D.</given-names>
            <surname>Gopal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Venkatesan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Yin</surname>
          </string-name>
          .
          <article-title>Empirical analysis of the impact of recommender systems on sales</article-title>
          .
          <source>Journal of Management Information Systems</source>
          ,
          <volume>27</volume>
          (
          <issue>2</issue>
          ):
          <fpage>159</fpage>
          -
          <lpage>188</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>J.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Arenas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          .
          <article-title>Semantics and complexity of SPARQL</article-title>
          .
          <source>ACM Trans. Database Syst</source>
          .,
          <volume>34</volume>
          (
          <issue>3</issue>
          ),
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>G.</given-names>
            <surname>Pirro</surname>
          </string-name>
          .
          <article-title>REWOrD: Semantic relatedness in the web of data</article-title>
          .
          <source>In Proc. of the 26th AAAI Conf. on Artificial Intelligence</source>
          , Toronto, Ontario, Canada,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>T.</given-names>
            <surname>Raeder</surname>
          </string-name>
          and
          <string-name>
            <given-names>N. V.</given-names>
            <surname>Chawla</surname>
          </string-name>
          .
          <article-title>Market basket analysis with networks</article-title>
          .
          <source>Social Network Analysis and Mining</source>
          ,
          <volume>1</volume>
          (
          <issue>2</issue>
          ):
          <fpage>97</fpage>
          -
          <lpage>113</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>F.</given-names>
            <surname>Ricci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rokach</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Shapira</surname>
          </string-name>
          , editors.
          <source>Recommender Systems Handbook</source>
          .
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ristoski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schuhmacher</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Paulheim</surname>
          </string-name>
          .
          <article-title>Using graph metrics for linked open data enabled recommender systems</article-title>
          .
          <source>In E-Commerce and Web Technologies - 16th Int. Conf. on Electronic Commerce and Web Technologies</source>
          , EC-Web 2015, Valencia, Spain, Revised Selected Papers, pages
          <fpage>30</fpage>
          -
          <lpage>41</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Hofman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Watts</surname>
          </string-name>
          .
          <article-title>Estimating the causal impact of recommendation systems from observational data</article-title>
          .
          <source>In Proc. of the 16th ACM Conf. on Economics and Computation</source>
          , EC '15, pages
          <fpage>453</fpage>
          -
          <lpage>470</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>G.</given-names>
            <surname>Suchacka</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Chodak</surname>
          </string-name>
          .
          <article-title>Using association rules to assess purchase probability in online stores</article-title>
          .
          <source>Information Systems and e-Business Management</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>30</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>I. M. A. O.</given-names>
            <surname>Swesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Bakar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A. S. A.</given-names>
            <surname>Kadir</surname>
          </string-name>
          .
          <article-title>Mining positive and negative association rules from interesting frequent and infrequent itemsets</article-title>
          .
          <source>In Fuzzy Systems and Knowledge Discovery (FSKD)</source>
          ,
          <source>2012 9th Int. Conf. on</source>
          , pages
          <fpage>650</fpage>
          -
          <lpage>655</lpage>
          , May
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. W.</given-names>
            <surname>White</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Chu</surname>
          </string-name>
          .
          <article-title>Learning to extract cross-session search tasks</article-title>
          .
          <source>In Proc. of the 22nd Int. Conf. on World Wide Web, WWW '13</source>
          , pages
          <fpage>1353</fpage>
          -
          <lpage>1364</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Z.</given-names>
            <surname>You</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <article-title>Integration of genomic and proteomic data to predict synthetic genetic interactions using semi-supervised learning</article-title>
          .
          <source>In Proc. of the 5th Int. Conf. on Emerging Intelligent Computing Technology and Applications</source>
          , ICIC '09, pages
          <fpage>635</fpage>
          -
          <lpage>644</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Niu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Bolivar</surname>
          </string-name>
          .
          <article-title>Substitutes or complements: Another step forward in recommendations</article-title>
          .
          <source>In Proc. of the 10th ACM Conf. on Electronic Commerce, EC '09</source>
          , pages
          <fpage>139</fpage>
          -
          <lpage>146</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>