<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Joint Modeling and Optimization of Search and Recommendation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hamed Zamani</string-name>
          <email>zamani@cs.umass.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>W. Bruce Croft</string-name>
          <email>croft@cs.umass.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Amherst</institution>
          ,
          <addr-line>MA 01003</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Center for Intelligent Information Retrieval</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Massachusetts Amherst</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Despite the somewhat different techniques used in developing search engines and recommender systems, they both follow the same goal: helping people to get the information they need at the right time. Due to this common goal, search and recommendation models can potentially benefit from each other. The recent advances in neural network technologies make them effective and easily extendable for various tasks, including retrieval and recommendation.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <label>1</label>
      <title>INTRODUCTION</title>
      <p>Joint Modeling and Optimization of Search and Recommendation. In Proceedings of Design of Experimental Search &amp; Information REtrieval Systems (DESIRES 2018). ACM, New York, NY, USA, 7 pages.</p>
      <p>
        A quarter century has passed since Belkin and Croft [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] discussed the similarity and unique challenges of information retrieval (IR) and information filtering (IF) systems. They concluded that their underlying goals are essentially equivalent, and thus they are two sides of the same coin. This is why content-based filtering approaches, especially those that deal with unstructured data, employ several techniques initially developed for IR tasks, e.g., see [
        <xref ref-type="bibr" rid="ref13 ref14 ref20 ref30">13, 14, 20, 30</xref>
        ]. With the growth of collaborative filtering approaches, IR and recommender systems (RecSys) have become two separate fields with little overlap between the two communities. Nevertheless, IR models and evaluation methodologies are still common in recommender systems. For instance, common IR evaluation metrics such as mean average precision (MAP) and normalized discounted cumulative gain (NDCG) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] are frequently used by the RecSys community [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <p>Figure 1: [...] and recommendation systems where items are shared, e.g., in e-commerce websites. The intuition behind joint modeling of search and recommendation is making use of training data from both sides to learn more accurate item representations. (Diagram labels: Users, Items, Search Engine, Recommendation Engine, search query, Search Result List.)</p>
      <p>
        IR models such as learning to rank approaches are also popular in the RecSys literature [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Costa and Roda [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] formulated recommender systems as an IR task. The language modeling framework for information retrieval [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] and relevance models [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] have also been adapted for the collaborative filtering task [
        <xref ref-type="bibr" rid="ref17 ref24 ref25">17, 24, 25</xref>
        ]. On the other hand, RecSys techniques have also been used in a number of IR tasks. For instance, Zamani et al. [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] cast the query expansion task as a recommendation problem and used a collaborative filtering approach to design a pseudo-relevance feedback model.
      </p>
      <p>In this paper, we revisit Belkin and Croft’s insights to relate these two fields once again. We believe that search engines and recommender systems seek the same goal: helping people get the information they need at the right time. Therefore, from an abstract point of view, joint modeling and optimization of search engines and recommender systems, if possible, could potentially benefit both systems. Successful implementation of such joint modeling could close the gap between the IR and RecSys communities. Moreover, joint optimization of search and recommendation is an interesting and feasible direction from the application point of view. For example, in e-commerce websites, such as Amazon<sup>1</sup> and eBay<sup>2</sup>, users use the search functionality to find the products relevant to their information needs, and the recommendation engine recommends them the products that are likely to address their needs. This makes search and recommendation the two major components in e-commerce websites. As depicted in Figure 1, they share the same set of products (and potentially users in case of personalized search), and thus the user interactions with both the search engine and the recommender system can be used to improve the performance in both retrieval and recommendation. Note that this is not limited to e-commerce websites; any service that provides both search and recommendation functionalities can benefit from such joint modeling and optimization. This includes media streaming services, such as Netflix and Spotify, media sharing services, such as YouTube, academic publishers, and news agencies.</p>
      <p>
        Deep learning approaches have recently shown state-of-the-art
performance in various retrieval [
        <xref ref-type="bibr" rid="ref16 ref29 ref5 ref6">5, 6, 16, 29</xref>
        ] and recommendation
tasks [
        <xref ref-type="bibr" rid="ref2 ref8">2, 8</xref>
        ]. Recently, Ai et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and Zhang et al. [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] showed
that using multiple sources of information is useful in both product
search and recommendation, which was made possible by neural
models in both applications. These neural retrieval and
recommendation models can be combined and trained jointly, which is the
focus of this paper. We propose a general framework, called JSR,<sup>3</sup> to
jointly model and train search engines and recommender systems.
As the first step towards implementing the JSR framework, we use
simple fully-connected neural networks to investigate the promise
of such joint modeling. We evaluate our models using Amazon’s
product dataset. Our experiments suggest that joint modeling can
lead to substantial improvements in both retrieval and
recommendation performance, compared to the models trained separately. We
show that joint modeling can also lead to better generalization by
preventing the model from overfitting the training data. The observed
substantial improvements suggest this research direction as a new
promising avenue in the IR and RecSys literature. We finish by
describing potential outcomes for this research direction.
      </p>
      <sec id="sec-6-2">
        <label>2</label>
        <title>THE JOINT SEARCH-RECOMMENDATION FRAMEWORK</title>
        <p>In this section, we describe our simple framework for joint
modeling and optimization of search engines and recommender systems,
called JSR. The purpose of JSR is to take advantage of both search
and recommendation training data in order to improve the
performance in both tasks. This can be achieved by learning joint
representations and simultaneous optimization. In the following
subsections, we simplify and formalize the task and further
introduce the JSR framework.
</p>
      </sec>
      <sec id="sec-6-3">
        <label>2.1</label>
        <title>Problem Statement</title>
        <p>Given a set of retrieval training data (e.g., a set of relevant and non-relevant query-item pairs) and a set of recommendation training data (e.g., a set of user-item-rating triples), the task is to train a retrieval model and a recommender system jointly. Formally, assume that I = {i<sub>1</sub>, i<sub>2</sub>, ···, i<sub>k</sub>} is a set of k items. Let D<sub>IR</sub> = {(q<sub>1</sub>, R<sub>1</sub>, R̄<sub>1</sub>), (q<sub>2</sub>, R<sub>2</sub>, R̄<sub>2</sub>), ···, (q<sub>n</sub>, R<sub>n</sub>, R̄<sub>n</sub>)} be a set of retrieval data, where R<sub>i</sub> ⊆ I and R̄<sub>i</sub> ⊆ I respectively denote the sets of relevant and non-relevant items for the query q<sub>i</sub>. Hence, R<sub>i</sub> ∩ R̄<sub>i</sub> = ∅. Also, let D<sub>RS</sub> = {(u<sub>1</sub>, I<sub>1</sub>), (u<sub>2</sub>, I<sub>2</sub>), ···, (u<sub>m</sub>, I<sub>m</sub>)} be a set of recommendation data, where I<sub>i</sub> ⊆ I denotes the set of items favored (e.g., purchased) by the user u<sub>i</sub>.<sup>4</sup> Assume that D<sub>IR</sub> is split into two disjoint subsets D<sub>IR</sub><sup>train</sup> and D<sub>IR</sub><sup>test</sup> by query, i.e., there is no query overlap between these two subsets. Also, assume that D<sub>RS</sub> is split into two disjoint subsets D<sub>RS</sub><sup>train</sup> and D<sub>RS</sub><sup>test</sup>, such that both subsets include all users, D<sub>RS</sub><sup>train</sup> contains a random subset of the items purchased by each user, and D<sub>RS</sub><sup>test</sup> contains the remaining items. This means that there is no user-item overlap between D<sub>RS</sub><sup>train</sup> and D<sub>RS</sub><sup>test</sup>. Note that although the training data for search ranking differs from the data used for training a recommender system, they both share the same set of items.</p>
        <p><sup>1</sup> https://www.amazon.com/</p>
        <p><sup>2</sup> https://www.ebay.com/</p>
        <p><sup>3</sup> JSR stands for the joint search and recommendation framework.</p>
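        <p>The notation above can be made concrete with a small sketch. The snippet is illustrative only: the toy items, queries, and users, the helper names (split_retrieval_by_query, split_recommendation_by_item), and the split fractions are assumptions introduced here and do not come from the paper. It builds a miniature retrieval set and recommendation set and splits them so that the two retrieval subsets share no query, while every user appears in both recommendation subsets with disjoint items.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random

# Hypothetical toy data; all names and values are illustrative.
# D_IR: one (query, relevant items, non-relevant items) triple per query.
d_ir = [
    ("phone case", {"i1", "i2"}, {"i5"}),
    ("headphones", {"i3"}, {"i1", "i6"}),
    ("screen protector", {"i4"}, {"i2"}),
    ("usb charger", {"i5", "i6"}, {"i3"}),
]

# D_RS: one (user, favored items) pair per user.
d_rs = [("u1", {"i1", "i3", "i4"}), ("u2", {"i2", "i5"}), ("u3", {"i3", "i6"})]


def split_retrieval_by_query(data, test_fraction, rng):
    """Split D_IR so that no query appears in both subsets."""
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def split_recommendation_by_item(data, train_fraction, rng):
    """Split each user's items; every user is listed in both subsets."""
    train, test = [], []
    for user, favored in data:
        ordered = sorted(favored)
        rng.shuffle(ordered)
        n_train = max(1, int(len(ordered) * train_fraction))
        train.append((user, set(ordered[:n_train])))
        test.append((user, set(ordered[n_train:])))
    return train, test


rng = random.Random(0)
ir_train, ir_test = split_retrieval_by_query(d_ir, 0.25, rng)
rs_train, rs_test = split_recommendation_by_item(d_rs, 0.5, rng)
```
]]></preformat>
        <p>Splitting the retrieval data by query and the recommendation data by item within each user mirrors the two constraints stated above; other split ratios would work the same way.</p>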
        <p>The task is to train a retrieval model M<sub>IR</sub> and a recommendation model M<sub>RS</sub> on the training sets D<sub>IR</sub><sup>train</sup> and D<sub>RS</sub><sup>train</sup>. The models M<sub>IR</sub> and M<sub>RS</sub> will be respectively evaluated based on the retrieval performance on the test queries in D<sub>IR</sub><sup>test</sup> and the recommendation performance based on predicting the favorite (e.g., purchased) items for each user in the test set D<sub>RS</sub><sup>test</sup>. Note that M<sub>IR</sub> and M<sub>RS</sub> may
share some parameters.
</p>
      </sec>
      <sec id="sec-6-4">
        <label>2.2</label>
        <title>The JSR Framework</title>
        <p>JSR is a general framework for jointly modeling search and recommendation and consists of two major components: a retrieval component and a recommendation component. The retrieval component computes the retrieval score for an item i given a query q and a query context c<sub>q</sub>. The query context may include the user profile, long-term search history, session information, or situational context such as location. The recommendation component computes a recommendation score for an item i given a user u and a user context c<sub>u</sub>. The user context may consist of the user’s recent activities, the user’s mood, situational context, etc. Figure 2 depicts a high-level overview of the JSR framework. Formally, the JSR framework calculates the following two scores:
          <disp-formula><label>(1)</label><tex-math>\text{retrieval score} = \psi\big(\phi_Q(q, c_q), \phi_I(i)\big)</tex-math></disp-formula>
          <disp-formula><label>(2)</label><tex-math>\text{recommendation score} = \psi'\big(\phi'_U(u, c_u), \phi'_I(i)\big)</tex-math></disp-formula>
          where ψ and ψ′ are the matching functions, and ϕ<sub>Q</sub>, ϕ<sub>I</sub>, ϕ′<sub>U</sub>, and ϕ′<sub>I</sub> are the representation learning functions. In the following subsection, we describe how we implement these functions using fully-connected feed-forward networks. This framework can be further implemented using more sophisticated and state-of-the-art search and recommendation network architectures. Note that the items are shared by both search and recommendation systems, thus they can benefit from an underlying shared representation for each item. For simplicity, we do not consider context in the initial framework described here.</p>
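        <p>As a rough illustration of Equations (1) and (2) without context, the sketch below scores an item for a query and for a user while reusing one item representation in both components. Everything in it is a placeholder chosen for this example: the lookup tables stand in for the learned representation functions, a dot product stands in for the matching functions, and the names (item_repr, query_repr, user_repr) are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random

DIM = 4  # hypothetical representation size
rng = random.Random(0)


def random_vector(dim):
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


# Hypothetical lookup tables standing in for the learned functions.
item_repr = {"i1": random_vector(DIM), "i2": random_vector(DIM)}  # shared by both components
query_repr = {"phone case": random_vector(DIM)}                   # stands in for phi_Q
user_repr = {"u1": random_vector(DIM)}                            # stands in for phi'_U


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieval_score(query, item):
    # psi(phi_Q(q), phi_I(i)); the dot product is only a placeholder for psi.
    return dot(query_repr[query], item_repr[item])


def recommendation_score(user, item):
    # psi'(phi'_U(u), phi'_I(i)); the same item table is reused here,
    # so an update to an item vector is visible to both scores.
    return dot(user_repr[user], item_repr[item])
```
]]></preformat>
        <p>Because both functions read from the same item table, a training signal from either task that changes an item vector also changes the other task's score for that item, which is the intuition behind sharing item representations.</p>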
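        <p>Independent from the way each component is implemented, we train the JSR framework by minimizing a joint loss function L that is equal to the sum of retrieval loss and recommendation loss, as follows:
          <disp-formula><tex-math>L(b, b') = L_{IR}(b) + L_{RS}(b')</tex-math></disp-formula>
          where b and b′ denote mini-batches of retrieval and recommendation training data, respectively.</p>
        <sec id="sec-6-4-1">
          <p>[Figure 2. Diagram labels: Retrieval Model, Shared Item Set, Recommendation Model.]</p>
          <p>Each training instance for the retrieval model is a query q<sub>j</sub> from D<sub>IR</sub><sup>train</sup>, a positive item i<sub>j</sub> sampled from R<sub>j</sub>, and a negative item ī<sub>j</sub> sampled from R̄<sub>j</sub>. L<sub>IR</sub>(b) is a binary cross-entropy loss function (i.e., equivalent [...]):
            <disp-formula><tex-math>L_{IR}(b) = -\sum_{j=1}^{|b|} \log p(i_j &gt; \bar{i}_j \mid q_j) = -\sum_{j=1}^{|b|} \log \frac{\exp(\psi(\phi_Q(q_j), \phi_I(i_j)))}{\exp(\psi(\phi_Q(q_j), \phi_I(i_j))) + \exp(\psi(\phi_Q(q_j), \phi_I(\bar{i}_j)))}</tex-math></disp-formula>
          </p>
          <p>The recommendation loss is defined similarly: for each user u<sub>j</sub>, we draw a positive sample i<sub>j</sub> from the user’s favorite items (i.e., I<sub>j</sub> in D<sub>RS</sub><sup>train</sup>) and a random negative sample ī<sub>j</sub> from I. L<sub>RS</sub>(b′) is also defined as a binary cross-entropy loss function as follows:
            <disp-formula><tex-math>L_{RS}(b') = -\sum_{j=1}^{|b'|} \log p(i_j &gt; \bar{i}_j \mid u_j) = -\sum_{j=1}^{|b'|} \log \frac{\exp(\psi'(\phi'_U(u_j), \phi'_I(i_j)))}{\exp(\psi'(\phi'_U(u_j), \phi'_I(i_j))) + \exp(\psi'(\phi'_U(u_j), \phi'_I(\bar{i}_j)))}</tex-math></disp-formula>
          </p>
          <p><sup>4</sup> This can be simply generalized to numeric ratings, as well.</p>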
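          <p>The pairwise losses and their sum can be sketched in a few lines. The function names (pairwise_log_loss, joint_loss) and the plain-list inputs are assumptions made for this illustration; each input list holds the matching scores of the positive or negative items in one mini-batch.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import math


def pairwise_log_loss(pos_scores, neg_scores):
    """Sum over a batch of -log( exp(s_pos) / (exp(s_pos) + exp(s_neg)) )."""
    total = 0.0
    for s_pos, s_neg in zip(pos_scores, neg_scores):
        # -log(exp(a) / (exp(a) + exp(b))) equals log(1 + exp(b - a)).
        total += math.log1p(math.exp(s_neg - s_pos))
    return total


def joint_loss(ir_pos, ir_neg, rs_pos, rs_neg):
    """Joint loss: retrieval loss on batch b plus recommendation loss on batch b'."""
    return pairwise_log_loss(ir_pos, ir_neg) + pairwise_log_loss(rs_pos, rs_neg)
```
]]></preformat>
          <p>When the positive and negative scores are equal, each pair contributes log 2; the loss shrinks as the positive item is scored higher than the negative one.</p>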
          <p>In summary, the search and recommendation components in the
JSR framework are modeled as two distinct functions that may share
some parameters. They are optimized via a joint loss function that
minimizes pairwise error in both retrieval and recommendation,
simultaneously.
</p>
        </sec>
      </sec>
      <sec id="sec-6-5">
        <label>2.3</label>
        <title>Implementation of JSR</title>
        <p>Since the purpose of this paper is to only show the potential
importance of joint modeling and optimization of search and
recommendation models, we simply use fully-connected feed-forward
networks to implement the components of the JSR framework. The
performance of more sophisticated search and recommendation
models will be investigated in the future. As mentioned earlier in
Section 2.2, we do not consider query and user contexts in our
experiments.</p>
        <p>
          We model the query representation function ϕ<sub>Q</sub> as a fully-connected network with a single hidden layer. The weighted average of embedding vectors for individual query terms is fed to this network. In other words, the input of the query representation network is
          <disp-formula><tex-math>\sum_{t \in q} \widehat{W}(t) \cdot E(t),</tex-math></disp-formula>
          where W : V → ℝ maps each term in the vocabulary set V to a global real-valued weight and E : V → ℝ<sup>d</sup> maps each term to a d-dimensional embedding vector. Note that the matrices W and E are optimized as part of the model at training time. Ŵ(t) is just a normalized weight computed using a softmax function as
          <disp-formula><tex-math>\widehat{W}(t) = \frac{\exp(W(t))}{\sum_{t' \in q} \exp(W(t'))}.</tex-math></disp-formula>
          This simple yet effective bag-of-words representation has been previously used in [
          <xref ref-type="bibr" rid="ref26 ref5">5, 26</xref>
          ] for the ad-hoc retrieval and query performance prediction tasks. The item representation functions ϕ<sub>I</sub> and ϕ′<sub>I</sub> are also implemented similarly. The matrices W and E are shared by all of these functions for transferring knowledge among the retrieval and recommendation components.
        </p>
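        <p>The softmax-weighted average of term embeddings can be sketched as follows. The toy vocabulary, weights, and two-dimensional embeddings are made up for this example, and the function names are hypothetical; in the model these parameters are learned.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math

# Hypothetical toy parameters: W maps a term to a scalar weight,
# E maps a term to a d-dimensional embedding (d = 2 here).
W = {"iphone": 2.0, "case": 1.0, "the": -1.0}
E = {"iphone": [1.0, 0.0], "case": [0.0, 1.0], "the": [0.5, 0.5]}


def normalized_weights(terms):
    """Softmax of W(t) over the terms of one query (or item text)."""
    max_w = max(W[t] for t in terms)  # subtract the max for numerical stability
    exps = [math.exp(W[t] - max_w) for t in terms]
    z = sum(exps)
    return [e / z for e in exps]


def weighted_average_embedding(terms):
    """Weighted sum of E(t) with softmax-normalized weights: the network input."""
    weights = normalized_weights(terms)
    dim = len(E[terms[0]])
    return [sum(w * E[t][k] for w, t in zip(weights, terms)) for k in range(dim)]
```
]]></preformat>
        <p>With these toy values, a low-weight term such as "the" contributes little to the averaged vector, which is the purpose of the learned global term weights.</p>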
        <p>The user representation function ϕ′<sub>U</sub> is simply implemented as a look-up table that returns the corresponding row of a user embedding matrix U : 𝒰 → ℝ<sup>d′</sup> that maps each user to a d′-dimensional dense vector. The model learns appropriate user representations based on the items they previously rated (or favored) in the training data.</p>
        <p>
          The matching functions ψ and ψ′ are implemented as two-layer fully-connected networks. The input of ψ is ϕ<sub>Q</sub> ◦ ϕ<sub>I</sub>, where ◦ denotes the Hadamard product. Similarly, ϕ′<sub>U</sub> ◦ ϕ′<sub>I</sub> is fed to the ψ′ network. This requires the outputs of ϕ<sub>Q</sub> and ϕ<sub>I</sub>, as well as those of ϕ′<sub>U</sub> and ϕ′<sub>I</sub>, to have equal dimensionalities. Note that ψ and ψ′ each return a single real-valued score. These matching functions are similar to those used in [
          <xref ref-type="bibr" rid="ref16 ref29">16, 29</xref>
          ] for web search.
        </p>
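        <p>A matching function of this shape can be sketched in plain Python. The layer sizes, random initial weights, and names (hadamard, dense, matching_score, params) are assumptions for illustration; the ReLU hidden activation and sigmoid output follow the activation choices stated in this section.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
import random


def hadamard(a, b):
    """Element-wise product of two equally sized vectors."""
    return [x * y for x, y in zip(a, b)]


def dense(x, weights, bias):
    """One fully-connected layer; weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def matching_score(left, right, params):
    """Two fully-connected layers over the element-wise product; returns one real value."""
    if len(left) != len(right):
        raise ValueError("both representations must have the same dimensionality")
    pre_activation = dense(hadamard(left, right), params["w1"], params["b1"])
    hidden = [max(0.0, h) for h in pre_activation]  # ReLU
    (logit,) = dense(hidden, params["w2"], params["b2"])
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid output


rng = random.Random(0)
DIM, HIDDEN = 4, 3  # hypothetical sizes
params = {
    "w1": [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(HIDDEN)],
    "b1": [0.0] * HIDDEN,
    "w2": [[rng.uniform(-1, 1) for _ in range(HIDDEN)]],
    "b2": [0.0],
}
```
]]></preformat>
        <p>The explicit length check reflects the constraint noted above: the Hadamard product is only defined when both representations have the same dimensionality.</p>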
        <p>In each network, we use ReLU as the activation function in the
hidden layers and sigmoid as the output activation function. We
also use dropout in all hidden layers to prevent overfitting.
</p>
      </sec>
      <sec id="sec-6-6">
        <label>3</label>
        <title>PRELIMINARY EXPERIMENTS</title>
        <p>In this section, we present a set of preliminary results that provide
insights into the advantages of jointly modeling and optimizing
search engines and recommender systems. Note that to fully
understand the value of the proposed framework, large-scale and detailed
evaluation and analysis are required and will be done in future
work.</p>
        <p>In the following, we first introduce our data for training and
evaluating both search and recommendation components. We
further review our experimental setup and evaluation metrics, which
are followed by the preliminary results and analysis.
</p>
      </sec>
      <sec id="sec-6-7">
        <label>3.1</label>
        <title>Data</title>
        <p>
          Experiment design for the search-recommendation joint
modeling task is challenging, since there is no public data available for
both tasks with a shared set of items. To evaluate our models, we
used the Amazon product dataset<sup>5</sup> [
          <xref ref-type="bibr" rid="ref15 ref7">7, 15</xref>
          ], consisting of millions
of users and products, as well as rich meta-data information
including user reviews, product categories, and product descriptions.
The data only contains the users and items with at least five
associated reviews. In our experiments, we used three subsets of this
dataset associated with the following categories: Electronics, Kindle
Store, and Cell Phones &amp; Accessories. The first two are large-scale
datasets covering common product types, while the last one is a
small dataset suitable for evaluating the models in a scenario where
data is limited.
        </p>
        <p>Recommendation Data: On the Amazon website, users can only
submit reviews for the products that they have already purchased.
Therefore, from each review we can infer that the user who wrote
it has purchased the corresponding item. This results in a set of
purchased (user, item) pairs for constructing the set DRS (see Section
2.1) that can be used for training and evaluating a recommender
system.</p>
        <p>
          Retrieval Data: The Amazon product data does not contain search queries, and thus cannot be directly used for evaluating retrieval models. As Rowley [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] investigated, directed product search queries contain either a producer’s name, a brand, or a set of terms describing the product category. Following this observation, Van Gysel et al. [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] proposed to automatically generate queries based on the product categories. To be exact, for each item in a category c, a query q is generated based on the terms in the category hierarchy of c. Then, all the items within that category are marked as relevant for the query q. The detailed description of the query generation process can be found in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. A set of random negative items is also sampled as non-relevant items to construct D<sub>IR</sub> (see Section 2.1) for training.
        </p>
        <p><sup>5</sup> http://jmcauley.ucsd.edu/data/amazon/</p>
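        <p>The query generation idea can be sketched as follows. This is a simplified illustration, not the exact procedure of the cited work: the toy catalogue, the way category terms are joined into a query string, and the function name build_retrieval_data are all assumptions made here.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random

# Hypothetical catalogue: item -> category path (root first).
catalogue = {
    "i1": ("Cell Phones & Accessories", "Cases"),
    "i2": ("Cell Phones & Accessories", "Cases"),
    "i3": ("Cell Phones & Accessories", "Headphones"),
    "i4": ("Electronics", "Headphones"),
}


def build_retrieval_data(catalogue, num_negatives, rng):
    """Return {query: (relevant items, sampled non-relevant items)}."""
    by_category = {}
    for item, path in catalogue.items():
        by_category.setdefault(path, set()).add(item)

    data = {}
    for path, relevant in by_category.items():
        # The query is formed from the terms of the category hierarchy.
        query = " ".join(path).lower()
        # Every item in the category is relevant; negatives come from the rest.
        candidates = sorted(set(catalogue) - relevant)
        negatives = set(rng.sample(candidates, min(num_negatives, len(candidates))))
        data[query] = (relevant, negatives)
    return data
```
]]></preformat>
        <p>Sampling negatives only from items outside the category keeps the relevant and non-relevant sets disjoint, as required by the problem statement.</p>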
      </sec>
      <sec id="sec-6-8">
        <label>3.2</label>
        <title>Experimental Setup</title>
        <p>
          We cleaned up the data by removing non-alphanumerical
characters and stopwords from queries and reviews. Similar to previous
work [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], the reviews of each item i were concatenated
to represent the item.
        </p>
        <p>
          We implemented our model using TensorFlow.<sup>6</sup> In all experiments, the network parameters were optimized using the Adam optimizer [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Hyper-parameters were optimized using grid search based on the loss value obtained on a validation set (the model was trained on 90% of the training set and the remaining 10% was used for validation). The learning rate was selected from {1E−5, 5E−4, 1E−4, 5E−4, 1E−3}. The batch sizes for both search and recommendation (see |b| and |b′| in Section 2.2) were selected from {32, 64, 128, 256}. The dropout keep probability was selected from {0.5, 0.8, 1.0}. The word and user embedding dimensionalities were set to 200 and the word embedding matrix was initialized by the GloVe vectors [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] trained on Wikipedia 2014 and Gigaword 5.<sup>7</sup>
        </p>
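        <p>The selection procedure can be sketched as a plain grid enumeration with a 90/10 train/validation split. The sketch makes two assumptions that the text does not settle: the two batch sizes are tuned independently, and the learning rate value that is printed twice in the list is kept once. The function names and the validation_loss callback are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import itertools
import random

# Candidate values from the text; 5E-4 is printed twice there and kept once here.
learning_rates = [1e-5, 5e-4, 1e-4, 1e-3]
batch_sizes = [32, 64, 128, 256]
keep_probs = [0.5, 0.8, 1.0]


def grid(search_batches, rec_batches):
    """Enumerate every hyper-parameter combination as a dict."""
    for lr, b, b_prime, keep in itertools.product(
        learning_rates, search_batches, rec_batches, keep_probs
    ):
        yield {
            "learning_rate": lr,
            "batch_size_search": b,
            "batch_size_rec": b_prime,
            "dropout_keep_prob": keep,
        }


def train_validation_split(examples, rng, train_fraction=0.9):
    """Hold out the last part of a shuffled copy for validation."""
    shuffled = list(examples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def select_best(configs, validation_loss):
    """Pick the configuration with the lowest validation loss."""
    return min(configs, key=validation_loss)
```
]]></preformat>
        <p>In practice, validation_loss would train a model with the given configuration and return its loss on the held-out 10%; here it is left as a callback so the sketch stays independent of any training code.</p>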
      </sec>
      <sec id="sec-6-9">
        <label>3.3</label>
        <title>Evaluation Metrics</title>
        <p>To evaluate the retrieval model, we use mean average precision
(MAP) of the top 100 retrieved items and normalized discounted
cumulative gain (NDCG) of the top 10 retrieved items (NDCG@10).
To evaluate the recommendation performance, we use NDCG, hit
ratio (Hit), and recall. The cut-off for all recommendation metrics is
10. Hit ratio is defined as the ratio of users that are recommended
at least one relevant item.</p>
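        <p>For reference, the metrics named above can be sketched with binary relevance as follows. These are common textbook formulations written for this illustration, not the evaluation code used in the experiments; in particular, normalizing average precision by min(|relevant|, k) is one of several conventions and is an assumption here.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def average_precision(ranked, relevant, k=100):
    """Average precision over the top-k items (MAP is its mean over queries)."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    denom = min(len(relevant), k)
    return total / denom if denom else 0.0


def ndcg(ranked, relevant, k=10):
    """Binary-gain NDCG at cut-off k."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0


def recall(ranked, relevant, k=10):
    """Fraction of the relevant items that appear in the top k."""
    return len(set(ranked[:k]) & relevant) / len(relevant) if relevant else 0.0


def hit_ratio(rankings, relevants, k=10):
    """Fraction of users with at least one relevant item in their top-k list."""
    hits = sum(1 for ranked, rel in zip(rankings, relevants) if set(ranked[:k]) & rel)
    return hits / len(rankings)
```
]]></preformat>
        <p>Each function takes a ranked list of item identifiers and a set of relevant items; hit_ratio takes one such pair per user.</p>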
      </sec>
      <sec id="sec-6-10">
        <label>3.4</label>
        <title>Results and Discussion</title>
        <p>Table 2 reports the retrieval performance for an individual retrieval
model and the one jointly learned with a recommendation model.
The results on three categories of the Amazon product dataset
demonstrate that the jointly learned model significantly
outperforms the individually trained model, in all cases. Note that the
network architecture in both models is the same and the only
difference is the way that they were trained, i.e., individual training vs.
co-training with the recommendation component. We followed the
same procedure to optimize the hyper-parameters for both models
to have a fair comparison.</p>
        <sec id="sec-6-10-1">
          <p><sup>6</sup> https://www.tensorflow.org/</p>
          <p><sup>7</sup> The pre-trained vectors are accessible via https://nlp.stanford.edu/projects/glove/.</p>
          <p>The results reported in Table 3 also show that the
recommendation model jointly learned with a retrieval model significantly
outperforms the one trained individually with the same
recommendation training data.</p>
          <p>In summary, joint modeling and optimization of search and
recommendation offers substantial improvements in both search
ranking and recommendation tasks. This indicates the potential in joint
modeling of these two highly correlated applications.</p>
          <p>It is important to fully understand the reasons behind such
improvements. To this aim, Figure 3 plots the recommendation loss
curves on the Cell Phones &amp; Accessories training data for two
recommendation models, one trained individually and the other one
trained jointly with the retrieval model. Although the individually
learned model underperforms the joint model (see Table 3), its
recommendation loss on the training data is lower (see Figure 3). A similar
observation can be made from the retrieval loss curves, which are
omitted due to space constraints. This suggests that the
individually learned model overfits the training data. Therefore,
joint training can also be used as a means to improve generalization
by preventing overfitting.</p>
          <p>Example. Here, we provide an example to intuitively justify the
superior performance of the proposed joint modeling. Assume that the
query “iphone accessories” is submitted. Relevant products include
various types of iPhone accessories, such as headphones, phone
cases, and screen protectors. However, the descriptions and
reviews of most of these items do not contain the term
“accessories”. This results in poor retrieval performance for a retrieval
model trained individually. On the other hand, the
recommendation training data shows that users who bought iPhones also bought
different types of iPhone accessories. Therefore, the representations
learned for these items, e.g., headphones, phone cases, and screen
protectors, are close in a jointly trained model. Thus, the retrieval
performance for the query “iphone accessories” improves when
joint training is employed.</p>
          <p>The recommender system can also benefit from joint
modeling. For example, to a user who bought a cell phone, a few headphones
that have been previously purchased together with this phone by
other users are recommended. From the retrieval training
data, all the headphones are relevant to the query “headphones” [...]</p>
        </sec>
      </sec>
      <sec id="sec-6-11">
        <label>4</label>
        <title>CONCLUSIONS AND FUTURE DIRECTIONS</title>
        <p>In this paper, we introduced the search-recommendation joint
modeling task by providing intuitions on why jointly modeling and
optimizing search engines and recommender systems could be
useful in practical scenarios. We performed a set of preliminary
experiments to investigate the feasibility of the task and observed
substantial improvements compared to the baselines. Our
experiments also verified that joint modeling can be seen as a means to
improve generalization by preventing overfitting. This work
smooths the path towards studying such a challenging task in
practical situations in the future.</p>
        <p>In the following, we present our insights into the
search-recommendation joint modeling task and how it can influence search
engines and recommender systems in the future.</p>
        <p>An immediate next step should be evaluating the JSR framework
in a real-world setting, where queries are issued by real users
and different relevance and recommendation signals (e.g., search
logs and purchase history) are available for training and evaluation.
This would demonstrate the actual advantages of the proposed JSR
framework in real systems.</p>
        <p>
          Furthermore, given the importance of learning from limited data
to both academia and industry [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ], we believe that the
significance of JSR could be even greater when training data for either
search or recommendation is limited. For instance, assume that
an information system has run a search engine for a while and
gathered a large amount of user interactions with the system, and a
recommender system has recently been added. In this case, the JSR
framework could be particularly useful for transferring the
information captured by the search logs to improve the recommendation
performance in such a cold-start setting. An even more extreme case
would be of interest, where training data for either search or
recommendation is available, but no labeled data is in hand for the
other task. On the one hand, this extreme case has several practical
advantages and enables information systems to provide both search
and recommendation functionalities when training data for only
one of these functionalities is available. On the other hand, this is a
theoretically interesting task, because it is not a typical transfer
learning problem; in transfer learning approaches, the distribution
of labeled data is often mapped to the distribution of unlabeled
target data, which cannot be applied here, since these are two different
problems with different inputs. From a theoretical point of view,
this extreme case can be viewed as a generalized version of typical
transfer learning.
        </p>
        <p>Moreover, in the JSR framework, the search and
recommendation components are learned simultaneously. Improving
one of these models (either search or recommendation) can therefore
intuitively improve the quality of the learned representations, which
in turn can directly affect the performance of the other task. For
example, improving the network architecture for the retrieval model
can potentially lead to improvements in the recommendation
performance. If future work verifies this intuition,
this would amount to “killing two birds with one stone”.</p>
      </sec>
      <sec id="sec-6-12">
        <title>ACKNOWLEDGEMENTS</title>
        <p>This work was supported in part by the Center for Intelligent
Information Retrieval. Any opinions, findings and conclusions or
recommendations expressed in this material are those of the authors and
do not necessarily reflect those of the sponsor. The authors thank
Qingyao Ai, John Foley, Helia Hashemi, and Ali Montazeralghaem
for their insightful comments.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Qingyao</given-names>
            <surname>Ai</surname>
          </string-name>
          , Yongfeng Zhang, Keping Bi, Xu Chen, and
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Learning a Hierarchical Embedding Model for Personalized Product Search</article-title>
          .
          <source>In SIGIR '17</source>
          . Shinjuku, Tokyo, Japan,
          <fpage>645</fpage>
          -
          <lpage>654</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Trapit</given-names>
            <surname>Bansal</surname>
          </string-name>
          , David Belanger, and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>McCallum</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Ask the GRU: Multi-task Learning for Deep Text Recommendations</article-title>
          .
          <source>In RecSys '16</source>
          . Boston, Massachusetts, USA,
          <fpage>107</fpage>
          -
          <lpage>114</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Nicholas J.</given-names>
            <surname>Belkin</surname>
          </string-name>
          and
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <year>1992</year>
          .
          <article-title>Information Filtering and Information Retrieval: Two Sides of the Same Coin?</article-title>
          <source>Commun. ACM</source>
          <volume>35</volume>
          ,
          <issue>12</issue>
          (Dec.
          <year>1992</year>
          ),
          <fpage>29</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Alberto</given-names>
            <surname>Costa</surname>
          </string-name>
          and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Roda</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Recommender Systems by Means of Information Retrieval</article-title>
          .
          <source>In WIMS '11</source>
          . Sogndal, Norway, Article 57,
          <fpage>57:1</fpage>
          -
          <lpage>57:5</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Mostafa</given-names>
            <surname>Dehghani</surname>
          </string-name>
          , Hamed Zamani, Aliaksei Severyn, Jaap Kamps, and
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Neural Ranking Models with Weak Supervision</article-title>
          .
          <source>In SIGIR '17</source>
          . Shinjuku, Tokyo, Japan,
          <fpage>65</fpage>
          -
          <lpage>74</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Jiafeng</given-names>
            <surname>Guo</surname>
          </string-name>
          , Yixing Fan,
          <string-name>
            <given-names>Qingyao</given-names>
            <surname>Ai</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>A Deep Relevance Matching Model for Ad-hoc Retrieval</article-title>
          .
          <source>In CIKM '16</source>
          . Indianapolis, Indiana, USA,
          <fpage>55</fpage>
          -
          <lpage>64</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Ruining</given-names>
            <surname>He</surname>
          </string-name>
          and
          <string-name>
            <given-names>Julian</given-names>
            <surname>McAuley</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering</article-title>
          .
          <source>In WWW '16</source>
          . Montréal, Québec, Canada,
          <fpage>507</fpage>
          -
          <lpage>517</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Xiangnan</given-names>
            <surname>He</surname>
          </string-name>
          , Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and
          <string-name>
            <given-names>Tat-Seng</given-names>
            <surname>Chua</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Neural Collaborative Filtering</article-title>
          .
          <source>In WWW '17</source>
          . Perth, Australia,
          <fpage>173</fpage>
          -
          <lpage>182</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Kalervo</given-names>
            <surname>Järvelin</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jaana</given-names>
            <surname>Kekäläinen</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Cumulated Gain-based Evaluation of IR Techniques</article-title>
          .
          <source>ACM Trans. Inf. Syst</source>
          .
          <volume>20</volume>
          ,
          <issue>4</issue>
          (Oct.
          <year>2002</year>
          ),
          <fpage>422</fpage>
          -
          <lpage>446</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Alexandros</given-names>
            <surname>Karatzoglou</surname>
          </string-name>
          , Linas Baltrunas, and
          <string-name>
            <given-names>Yue</given-names>
            <surname>Shi</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Learning to Rank for Recommender Systems</article-title>
          .
          <source>In RecSys '13</source>
          . Hong Kong, China,
          <fpage>493</fpage>
          -
          <lpage>494</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Diederik</given-names>
            <surname>Kingma</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jimmy</given-names>
            <surname>Ba</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Adam: A Method for Stochastic Optimization</article-title>
          .
          <source>arXiv preprint arXiv:1412.6980</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Victor</given-names>
            <surname>Lavrenko</surname>
          </string-name>
          and
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Relevance Based Language Models</article-title>
          .
          <source>In SIGIR '01</source>
          . New Orleans, Louisiana, USA,
          <fpage>120</fpage>
          -
          <lpage>127</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Victor</given-names>
            <surname>Lavrenko</surname>
          </string-name>
          , Matt Schmill, Dawn Lawrie, Paul Ogilvie, David Jensen, and
          <string-name>
            <given-names>James</given-names>
            <surname>Allan</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Language Models for Financial News Recommendation</article-title>
          .
          <source>In CIKM '00</source>
          . McLean, Virginia, USA,
          <fpage>389</fpage>
          -
          <lpage>396</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Pasquale</given-names>
            <surname>Lops</surname>
          </string-name>
          , Marco de Gemmis, and
          <string-name>
            <given-names>Giovanni</given-names>
            <surname>Semeraro</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Content-based Recommender Systems: State of the Art and Trends</article-title>
          . Springer US, Boston, MA,
          <fpage>73</fpage>
          -
          <lpage>105</lpage>
          . DOI:http://dx.doi.org/10.1007/978-0-387-85820-3_3
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Julian</given-names>
            <surname>McAuley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Targett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Qinfeng</given-names>
            <surname>Shi</surname>
          </string-name>
          , and Anton van den Hengel.
          <year>2015</year>
          .
          <article-title>Image-Based Recommendations on Styles and Substitutes</article-title>
          .
          <source>In SIGIR '15</source>
          . Santiago, Chile,
          <fpage>43</fpage>
          -
          <lpage>52</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Bhaskar</given-names>
            <surname>Mitra</surname>
          </string-name>
          , Fernando Diaz, and
          <string-name>
            <given-names>Nick</given-names>
            <surname>Craswell</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Learning to Match Using Local and Distributed Representations of Text for Web Search</article-title>
          .
          <source>In WWW '17</source>
          . Perth, Australia,
          <fpage>1291</fpage>
          -
          <lpage>1299</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Javier</given-names>
            <surname>Parapar</surname>
          </string-name>
          , Alejandro Bellogín, Pablo Castells, and
          <string-name>
            <given-names>Álvaro</given-names>
            <surname>Barreiro</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Relevance-based Language Modelling for Recommender Systems</article-title>
          .
          <source>Inf. Process. Manage.</source>
          <volume>49</volume>
          ,
          <issue>4</issue>
          (July
          <year>2013</year>
          ),
          <fpage>966</fpage>
          -
          <lpage>980</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Pennington</surname>
          </string-name>
          , Richard Socher, and
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>GloVe: Global Vectors for Word Representation</article-title>
          .
          <source>In EMNLP '14</source>
          . Doha, Qatar,
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Jay M.</given-names>
            <surname>Ponte</surname>
          </string-name>
          and
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <year>1998</year>
          .
          <article-title>A Language Modeling Approach to Information Retrieval</article-title>
          .
          <source>In SIGIR '98</source>
          . Melbourne, Australia,
          <fpage>275</fpage>
          -
          <lpage>281</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Hossein</given-names>
            <surname>Rahmatizadeh Zagheli</surname>
          </string-name>
          , Hamed Zamani, and
          <string-name>
            <given-names>Azadeh</given-names>
            <surname>Shakery</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A Semantic-Aware Profile Updating Model for Text Recommendation</article-title>
          .
          <source>In RecSys '17</source>
          . Como, Italy,
          <fpage>316</fpage>
          -
          <lpage>320</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Jennifer</given-names>
            <surname>Rowley</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Product Search in e-Shopping: A Review and Research Propositions</article-title>
          .
          <source>Journal of Consumer Marketing</source>
          <volume>17</volume>
          ,
          <issue>1</issue>
          (
          <year>2000</year>
          ),
          <fpage>20</fpage>
          -
          <lpage>35</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Markus</given-names>
            <surname>Schedl</surname>
          </string-name>
          , Hamed Zamani,
          <string-name>
            <given-names>Ching-Wei</given-names>
            <surname>Chen</surname>
          </string-name>
          , Yashar Deldjoo, and
          <string-name>
            <given-names>Mehdi</given-names>
            <surname>Elahi</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Current Challenges and Visions in Music Recommender Systems Research</article-title>
          .
          <source>International Journal of Multimedia Information Retrieval</source>
          (05 Apr
          <year>2018</year>
          ). DOI:http://dx.doi.org/10.1007/s13735-018-0154-2
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Christophe</given-names>
            <surname>Van Gysel</surname>
          </string-name>
          , Maarten de Rijke, and
          <string-name>
            <given-names>Evangelos</given-names>
            <surname>Kanoulas</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Learning Latent Vector Spaces for Product Search</article-title>
          .
          <source>In CIKM '16</source>
          . Indianapolis, Indiana, USA,
          <fpage>165</fpage>
          -
          <lpage>174</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Jun</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Arjen P.</given-names>
            <surname>de Vries</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Marcel J. T.</given-names>
            <surname>Reinders</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>A User-item Relevance Model for Log-based Collaborative Filtering</article-title>
          .
          <source>In ECIR '06</source>
          . London, UK,
          <fpage>37</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Jun</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Arjen P.</given-names>
            <surname>de Vries</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Marcel J. T.</given-names>
            <surname>Reinders</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Unified Relevance Models for Rating Prediction in Collaborative Filtering</article-title>
          .
          <source>ACM Trans. Inf. Syst</source>
          .
          <volume>26</volume>
          ,
          <issue>3</issue>
          , Article 16 (June
          <year>2008</year>
          ),
          <fpage>16:1</fpage>
          -
          <lpage>16:42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Hamed</given-names>
            <surname>Zamani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. Shane</given-names>
            <surname>Culpepper</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Neural Query Performance Prediction Using Weak Supervision from Multiple Signals</article-title>
          .
          <source>In SIGIR '18</source>
          .
          <fpage>105</fpage>
          -
          <lpage>114</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Hamed</given-names>
            <surname>Zamani</surname>
          </string-name>
          , Javid Dadashkarimi, Azadeh Shakery, and
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Pseudo-Relevance Feedback Based on Matrix Factorization</article-title>
          .
          <source>In CIKM '16</source>
          . Indianapolis, Indiana, USA,
          <fpage>1483</fpage>
          -
          <lpage>1492</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Hamed</given-names>
            <surname>Zamani</surname>
          </string-name>
          , Mostafa Dehghani, Fernando Diaz,
          <string-name>
            <given-names>Hang</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Nick</given-names>
            <surname>Craswell</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>SIGIR 2018 Workshop on Learning from Limited or Noisy Data for Information Retrieval</article-title>
          .
          <source>In SIGIR '18</source>
          .
          <fpage>1439</fpage>
          -
          <lpage>1440</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Hamed</given-names>
            <surname>Zamani</surname>
          </string-name>
          , Bhaskar Mitra, Xia Song, Nick Craswell, and
          <string-name>
            <given-names>Saurabh</given-names>
            <surname>Tiwary</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Neural Ranking Models with Multiple Document Fields</article-title>
          .
          <source>In WSDM '18</source>
          . Marina Del Rey, CA, USA,
          <fpage>700</fpage>
          -
          <lpage>708</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Hamed</given-names>
            <surname>Zamani</surname>
          </string-name>
          and
          <string-name>
            <given-names>Azadeh</given-names>
            <surname>Shakery</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>A Language Model-based Framework for Multi-publisher Content-based Recommender Systems</article-title>
          .
          <source>Information Retrieval Journal</source>
          (06 Feb
          <year>2018</year>
          ). DOI:http://dx.doi.org/10.1007/s10791-018-9327-0
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Yongfeng</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Qingyao Ai,
          <string-name>
            <given-names>Xu</given-names>
            <surname>Chen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Joint Representation Learning for Top-N Recommendation with Heterogeneous Information Sources</article-title>
          .
          <source>In CIKM '17</source>
          . Singapore, Singapore,
          <fpage>1449</fpage>
          -
          <lpage>1458</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>