<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>August</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Transfer Learning from APP Domain to News Domain for Dual Cold-Start Recommendation</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>College of Computer Science and Software Engineering, Shenzhen University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science and Engineering, Hong Kong University of Science and Technology</institution>
        </aff>
      </contrib-group>
      <kwd-group>
        <title>Keywords</title>
        <kwd>Transfer Learning</kwd>
        <kwd>News Recommendation</kwd>
        <kwd>Cold-Start Recommendation</kwd>
      </kwd-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <volume>27</volume>
      <issue>2017</issue>
      <fpage>38</fpage>
      <lpage>41</lpage>
      <abstract>
        <p>News recommendation has become a must-have service for most mobile device users who want to know what is happening in the world. In this paper, we focus on recommending the latest news articles to new users, which involves both the new user cold-start challenge and the new item (i.e., news article) cold-start challenge, and is thus termed dual cold-start recommendation (DCSR). In response, we propose a solution called neighborhood-based transfer learning (NTL) for this new problem. Specifically, to address the new user cold-start challenge, we propose a cross-domain preference assumption, i.e., users with similar app-installation behaviors are likely to have similar tastes in news articles, and then transfer the knowledge of the cold-start users' neighborhood from an APP domain to a news domain. For the new item cold-start challenge, we design a category-level preference to replace the traditional item-level preference, because the latter is not applicable to the new items in our problem. We then conduct empirical studies on a real industry dataset with both users' app-installation behaviors and news-reading behaviors, and find that our NTL delivers news articles more accurately than other methods on different ranking-oriented evaluation metrics.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>Intelligent recommendation systems [5] have become a
ubiquitous service in our daily life, saving us a lot of
time in finding proper information such as music, goods and
news articles. For instance, personalized news
recommendation [1, 2] has become one of the must-have services for most
mobile device users, and plays an important role in
helping users keep up with current affairs in the world. In
this paper, we focus on recommending the latest news articles
to new users, i.e., the users are newly registered in a certain
news recommendation service and have not read any news
articles before, and the news articles have not been read by
any users before. We term this problem dual cold-start
recommendation (DCSR), denoting both cold-start users and cold-start
items.</p>
      <p>For the dual cold-start problem, previous news
recommendation methods [1, 2] are not applicable, because they rely
on users’ historical reading behaviors and news articles’
content information, neither of which is available in our case.</p>
      <p>We address the cold-start recommendation
problem from a transfer learning [3, 4] view. Although no
behaviors of the cold-start users or on the cold-start
items are available in the news domain, there may be some other
related domains with users’ behaviors. Specifically, we leverage
some knowledge from a related domain, i.e., the APP domain,
where the users’ app-installation behaviors are available. We
find that most cold-start users in the news domain have
already installed some apps, which may be helpful in
determining their preferences in news articles. In particular,
we assume that users with similar app-installation behaviors
are likely to have similar interests in some news topics. In
other words, close neighbors in the APP domain are likely
to be close neighbors in the news domain.</p>
      <p>With the above cross-domain preference assumption, we
propose to take the neighborhood in the APP domain as
the knowledge and try to transfer it to the target domain of
news articles. Specifically, we design a neighborhood-based
transfer learning (NTL) solution that transfers knowledge
of neighborhood from the APP domain to the news domain,
which addresses the new user cold-start challenge. With the
neighborhood, some well-studied neighborhood-based
recommendation methods are applicable for news
recommendation.</p>
      <p>We conduct empirical studies on a real industry dataset in
order to verify our cross-domain preference assumption and
the effectiveness of our transfer learning solution.
Experimental results show that the two domains of apps and news
articles are indeed related and can share some knowledge for
preference learning.</p>
    </sec>
    <sec id="sec-2">
      <title>OUR SOLUTION</title>
      <p>In our studied news recommendation problem, we have
two domains, including an APP domain and a news domain.</p>
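      <p>Firstly, in the APP domain, we have a set of triples, i.e.,
(<italic>u</italic>, <italic>g</italic>, <inline-formula><tex-math>G_{ug}</tex-math></inline-formula>), denoting that user <italic>u</italic> has installed
mobile apps belonging to the genre <italic>g</italic> a total of <inline-formula><tex-math>G_{ug}</tex-math></inline-formula> times. The data of the APP
domain can then be represented as a user-genre matrix <bold>G</bold>,
as shown in Figure 1.</p>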
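      <p>Secondly, in the news domain, we have a user-item
matrix <bold>R</bold> denoting whether a user has read an item. Each
item <italic>i</italic> is associated with a level-1 category <inline-formula><tex-math>c_1(i)</tex-math></inline-formula> and a
level-2 category <inline-formula><tex-math>c_2(i)</tex-math></inline-formula>. We thus have a set of quadruples, i.e.,
<inline-formula><tex-math>(u, i, c_1(i), c_2(i))</tex-math></inline-formula>, denoting that user <italic>u</italic> has read an item <italic>i</italic>
belonging to <inline-formula><tex-math>c_1(i)</tex-math></inline-formula> and <inline-formula><tex-math>c_2(i)</tex-math></inline-formula>. Finally, we have a user-category
matrix <bold>C</bold> after pre-processing, where each entry denotes the
number of items belonging to a certain category that a user
has read.</p>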
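      <p>As a minimal sketch of this pre-processing step, the Python snippet below aggregates reading records into per-user category counts, i.e., the rows of the user-category matrix at both category levels. The function name build_user_category_counts, the record layout and the choice to count each (user, item) pair only once are illustrative assumptions, not details given in the paper.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter, defaultdict


def build_user_category_counts(read_events):
    """Aggregate (user, item, level-1 category, level-2 category) records.

    Returns two mappings, user -> Counter(category -> number of read items),
    one for each category level.
    """
    level1 = defaultdict(Counter)
    level2 = defaultdict(Counter)
    seen = set()
    for user, item, c1, c2 in read_events:
        # Assumption: repeated reads of the same item by the same user count once.
        if (user, item) in seen:
            continue
        seen.add((user, item))
        level1[user][c1] += 1
        level2[user][c2] += 1
    return level1, level2


# Toy records with made-up users, items and categories.
events = [
    ("u1", "n1", "sports", "football"),
    ("u1", "n2", "sports", "tennis"),
    ("u1", "n1", "sports", "football"),  # duplicate read, ignored
    ("u2", "n3", "finance", "stocks"),
]
counts_c1, counts_c2 = build_user_category_counts(events)
print(dict(counts_c1["u1"]), dict(counts_c2["u1"]))
```
]]></preformat>
      <p>A nested dictionary of counters is used here instead of a dense matrix because most users read items from only a few of the categories.</p>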
      <p>Our goal is to recommend a ranked list of new items (i.e.,
latest news articles) to each new user, who has not read any
items before. We can see that it is a new user cold-start
and new item cold-start problem, which is thus termed as
dual cold-start recommendation (DCSR). Note that we only
make use of items’ category information, but not content
information.</p>
      <p>We put some notations in Table 1.</p>
    </sec>
    <sec id="sec-3">
      <title>Challenges</title>
      <p>The main difficulty of the DCSR problem is the lack of
preference data for new users and new items. Specifically,
there are two challenges: (i) the new user cold-start
challenge, i.e., the target users (to whom we will provide
recommendations) have not read any items before; and (ii) the
new item cold-start challenge, i.e., the target items (that we
will recommend to the target users) are totally new for all
users. In such a situation, most existing
recommendation algorithms are not applicable.</p>
    </sec>
    <sec id="sec-4">
      <title>Neighborhood-based Transfer Learning</title>
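      <p>In most recommendation methods [5], the user-user (or
item-item) similarity is a central concept, because it defines a
neighborhood of like-minded users whose preferences are
aggregated to predict the target user’s preference.
Mathematically, the preference prediction rule for user
<italic>u</italic> to item <italic>i</italic> is as follows,</p>
      <disp-formula id="eq1">
        <label>(1)</label>
        <tex-math>\hat{r}_{u,i} = \frac{1}{|\mathcal{N}_u|} \sum_{u' \in \mathcal{N}_u} \hat{r}_{u',i},</tex-math>
      </disp-formula>
      <p>where <inline-formula><tex-math>\mathcal{N}_u</tex-math></inline-formula> is a set of nearest neighbors of user <italic>u</italic> in terms of a
certain similarity measurement such as cosine similarity, and
<inline-formula><tex-math>\hat{r}_{u',i}</tex-math></inline-formula> is the estimated preference of user <italic>u′</italic> (a close neighbor
of user <italic>u</italic>) to item <italic>i</italic>. The aggregated and normalized score
<inline-formula><tex-math>\hat{r}_{u,i}</tex-math></inline-formula> is taken as the preference of user <italic>u</italic> to item <italic>i</italic>, which is
further used for item ranking and top-<italic>K</italic> recommendation.</p>
      <p>For our studied dual cold-start recommendation problem,
we cannot build correlations between a cold-start user in the
test data and a warm-start user in the training data using
the data from the news domain only. The main idea of our
transfer learning [3] solution is to leverage the correlations
among the users in the APP domain, with the assumption
that users with similar app-installation behaviors are likely
to be similar in news taste. For instance, two users who have
both installed apps of the genre business may both prefer
news articles on topics like finance.</p>
      <p>With the cross-domain preference assumption, we first
calculate the cosine similarity between a cold-start user <italic>u</italic> and
a warm-start user <italic>u′</italic> in the APP domain as follows,</p>
      <disp-formula id="eq2">
        <label>(2)</label>
        <tex-math>s_{u,u'} = \frac{G_{u\cdot} G_{u'\cdot}^{T}}{\sqrt{G_{u\cdot} G_{u\cdot}^{T}} \sqrt{G_{u'\cdot} G_{u'\cdot}^{T}}},</tex-math>
      </disp-formula>
      <p>where <inline-formula><tex-math>G_{u\cdot}</tex-math></inline-formula> is a row vector w.r.t. user <italic>u</italic> from the user-genre
matrix <bold>G</bold>. Once we have calculated the cosine similarity,
for each cold-start user <italic>u</italic>, we first remove users with a small
similarity value (e.g., <inline-formula><tex-math>s_{u,u'} &lt; 0.1</tex-math></inline-formula>), and then take some (e.g.,
100) most similar users to construct a neighborhood <inline-formula><tex-math>\mathcal{N}_u</tex-math></inline-formula>.</p>
      <p>For the item-level preference <inline-formula><tex-math>\hat{r}_{u',i}</tex-math></inline-formula> in Eq.(<xref ref-type="disp-formula" rid="eq1">1</xref>), we are not
able to have such a score directly because the item <italic>i</italic> is new
for all users, including the warm-start users (such as <italic>u′</italic>) and the target
cold-start user <italic>u</italic>. We thus propose to approximate the
item-level preference using a category-level preference,</p>
      <disp-formula id="eq3">
        <label>(3)</label>
        <tex-math>\hat{r}_{u',i} \approx \hat{r}_{u',c(i)},</tex-math>
      </disp-formula>
      <p>where <inline-formula><tex-math>c(i)</tex-math></inline-formula> can be the level-1 category or level-2 category.
We then have two types of category-level preferences,</p>
      <disp-formula id="eq4">
        <label>(4)</label>
        <tex-math>\hat{r}_{u',c(i)} = \hat{r}_{u',c_1(i)} = N_{u',c_1(i)},</tex-math>
      </disp-formula>
      <disp-formula id="eq5">
        <label>(5)</label>
        <tex-math>\hat{r}_{u',c(i)} = \hat{r}_{u',c_2(i)} = N_{u',c_2(i)},</tex-math>
      </disp-formula>
      <p>where <inline-formula><tex-math>N_{u',c_1(i)}</tex-math></inline-formula> and <inline-formula><tex-math>N_{u',c_2(i)}</tex-math></inline-formula> denote the number of items read
by user <italic>u′</italic> that belong to the level-1 category <inline-formula><tex-math>c_1(i)</tex-math></inline-formula> and the
level-2 category <inline-formula><tex-math>c_2(i)</tex-math></inline-formula>, respectively.</p>
      <p>Finally, with Eqs.(<xref ref-type="disp-formula" rid="eq3">3</xref>-<xref ref-type="disp-formula" rid="eq5">5</xref>), we can rewrite Eq.(<xref ref-type="disp-formula" rid="eq1">1</xref>) as
follows,</p>
      <disp-formula id="eq6">
        <label>(6)</label>
        <tex-math>\hat{r}_{u,i} \approx \frac{1}{|\mathcal{N}_u|} \sum_{u' \in \mathcal{N}_u} N_{u',c_1(i)},</tex-math>
      </disp-formula>
      <disp-formula id="eq7">
        <label>(7)</label>
        <tex-math>\hat{r}_{u,i} \approx \frac{1}{|\mathcal{N}_u|} \sum_{u' \in \mathcal{N}_u} N_{u',c_2(i)},</tex-math>
      </disp-formula>
      <p>which will be used for preference prediction in our
empirical studies. Specifically, the neighborhood <inline-formula><tex-math>\mathcal{N}_u</tex-math></inline-formula> addresses the
new user cold-start challenge, and the category-level
preference <inline-formula><tex-math>N_{u',c_1(i)}</tex-math></inline-formula> or <inline-formula><tex-math>N_{u',c_2(i)}</tex-math></inline-formula> addresses the new item cold-start
challenge.</p>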
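      <p>The procedure described in this section (cosine similarity in the APP domain, a similarity threshold, a fixed-size neighborhood, and averaged category-level read counts) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the dictionary-based data layout, the tie-breaking by identifier and the toy data are all hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length count vectors (0.0 if one is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_neighborhood(target_row, warm_rows, min_sim=0.1, max_neighbors=100):
    """Ids of the most similar warm-start users, after dropping those below min_sim."""
    scored = [(cosine(target_row, row), uid) for uid, row in warm_rows.items()]
    kept = [(sim, uid) for sim, uid in scored if sim >= min_sim]
    # Highest similarity first; ties broken by id (an assumption of this sketch).
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [uid for _, uid in kept[:max_neighbors]]


def ntl_score(neighbors, category_counts, category):
    """Average, over the neighbors, of the number of items each read in `category`."""
    if not neighbors:
        return 0.0
    total = sum(category_counts.get(uid, {}).get(category, 0) for uid in neighbors)
    return total / len(neighbors)


def recommend(target_row, warm_rows, category_counts, item_category, k=15):
    """Rank new items for one cold-start user by the averaged category-level score."""
    neighbors = build_neighborhood(target_row, warm_rows)
    scores = {
        item: ntl_score(neighbors, category_counts, cat)
        for item, cat in item_category.items()
    }
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:k]


# Toy data: three warm-start users described by three app genres.
warm_rows = {"a": [3, 0, 1], "b": [0, 4, 0], "c": [2, 0, 2]}
category_counts = {
    "a": {"finance": 5, "sports": 1},
    "b": {"sports": 9},
    "c": {"finance": 3},
}
item_category = {"n1": "sports", "n2": "finance", "n3": "travel"}
print(recommend([1, 0, 0], warm_rows, category_counts, item_category, k=2))
```
]]></preformat>
      <p>Passing level-1 categories in item_category and category_counts gives the level-1 variant, and passing level-2 categories gives the level-2 variant. Because all new items in the same category receive the same score, how ties are broken affects the final list; the sketch breaks them by item identifier only to stay deterministic.</p>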
    </sec>
    <sec id="sec-5">
      <title>EXPERIMENTAL RESULTS</title>
    </sec>
    <sec id="sec-6">
      <title>Dataset and Evaluation Metrics</title>
      <p>In our empirical studies, we use a real industry dataset, which
consists of an APP domain and a news domain.</p>
      <p><bold>APP Domain.</bold> In the auxiliary domain, i.e., the APP domain,
we have 827,949 users and 53 description terms (i.e.,
genres) of the users’ installed mobile apps, where the genres are
from Google Play. Considering our target task of news
recommendation, we removed 14 undiscriminating or irrelevant
genres such as tools, communication, social, entertainment,
productivity, weather and dating. Finally, we have a matrix
<bold>G</bold> with 827,949 users (rows) and 39 genres (columns),
where each entry represents the number of times that a user
has installed apps belonging to a genre.</p>
      <p><bold>News Domain.</bold> In the target domain, i.e., the news domain,
we have two sets of data, a training set and a
test set. The training data spans from 10 January 2017 to
30 January 2017, and contains 806,167 users, 747,643 items
(i.e., news articles), and 16,199,385 unique (user, item) pairs.
A user thus read about 16,199,385/806,167 ≈
20.09 articles on average over this period. The test data is from 31 January 2017,
and contains 3,597 new users, 28,504 new items (i.e., news
articles), and 4,813 unique (user, item) pairs. A cold-start user
thus read about 4,813/3,597 ≈ 1.34 articles
on 31 January 2017. Note that we have |C<sub>1</sub>| = 26 level-1
categories and |C<sub>2</sub>| = 222 level-2 categories for the items
in the news domain.</p>
      <p>For performance evaluation, we adopt some commonly
used evaluation metrics in ranking-oriented recommendation
such as precision, recall, F1, NDCG and 1-call. Specifically,
we study the average performance of the top-15
recommended list generated for each cold-start user in the test data.</p>
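      <p>The exact metric formulas are not spelled out in this section. The sketch below assumes the common binary-relevance definitions at a cutoff K for a single user; the function name ranking_metrics and the example lists are hypothetical, and the per-user values would be averaged over all cold-start users in the test data.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def ranking_metrics(recommended, relevant, k=15):
    """Precision, recall, F1, NDCG and 1-call at cutoff k for one user.

    `recommended` is a ranked list of item ids and `relevant` is the set of
    items the user actually read in the test data (binary relevance).
    """
    top_k = recommended[:k]
    hits = [1 if item in relevant else 0 for item in top_k]
    n_hits = sum(hits)

    precision = n_hits / k
    recall = n_hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if n_hits else 0.0

    # DCG with a log2 position discount; the ideal list puts all relevant items first.
    dcg = sum(h / math.log2(pos + 2) for pos, h in enumerate(hits))
    idcg = sum(1 / math.log2(pos + 2) for pos in range(min(len(relevant), k)))
    ndcg = dcg / idcg if idcg else 0.0

    one_call = 1.0 if n_hits else 0.0  # at least one relevant item in the top k
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "ndcg": ndcg,
        "one_call": one_call,
    }


# Hypothetical example: one of the two relevant items appears at rank 2.
print(ranking_metrics(["a", "b", "c", "d"], {"b", "x"}, k=4))
```
]]></preformat>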
    </sec>
    <sec id="sec-7">
      <title>Baselines and Parameter Settings</title>
      <p>We compare our proposed transfer learning solution with
a random method and two popularity-based methods using
category information.</p>
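      <list list-type="bullet">
        <list-item>
          <p><bold>Random recommendation (Random).</bold> In Random,
we randomly select <italic>K</italic> = 15 items in the test data for
each cold-start user.</p>
        </list-item>
        <list-item>
          <p><bold>Popularity-based ranking via level-1 category
(PopRank-C1).</bold> In PopRank-C1, we first calculate the
popularity <inline-formula><tex-math>p_{c_1}</tex-math></inline-formula> of each level-1 category <inline-formula><tex-math>c_1 \in C_1</tex-math></inline-formula> in the training
data, and then use <inline-formula><tex-math>\hat{r}_i = p_{c_1(i)}</tex-math></inline-formula> in Table 1 as the score
to rank each item (i.e., article) <italic>i</italic> in the test data. When
the most popular level-1 category contains more
than <italic>K</italic> = 15 items (i.e., articles) in the test data, we
randomly take <italic>K</italic> items (i.e., articles) from that
level-1 category for recommendation.</p>
        </list-item>
        <list-item>
          <p><bold>Popularity-based ranking via level-2 category
(PopRank-C2).</bold> In PopRank-C2, we use <inline-formula><tex-math>\hat{r}_i = p_{c_2(i)}</tex-math></inline-formula> in Table 1
as the prediction rule, similarly to PopRank-C1.</p>
        </list-item>
      </list>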
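      <p>A popularity-based baseline of this kind can be sketched as below. The popularity definition from Table 1 is not reproduced in the text, so the sketch assumes that a category's popularity is the number of training reads falling in it; the function name pop_rank, its arguments and the fixed random seed are likewise hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random
from collections import Counter


def pop_rank(test_item_category, training_read_categories, k=15, rng=None):
    """Rank test items by the popularity of their category in the training data.

    test_item_category: dict mapping each new item id to its category.
    training_read_categories: iterable with one category per training read.
    Items that share a category get the same score, so ties are broken randomly.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed only to make the sketch reproducible
    popularity = Counter(training_read_categories)  # assumed definition of popularity
    items = list(test_item_category)
    rng.shuffle(items)  # random order among equally scored items
    # list.sort is stable, so the shuffled order survives within each score group.
    items.sort(key=lambda item: popularity[test_item_category[item]], reverse=True)
    return items[:k]


# Toy data: "sports" is the most popular category in training.
training = ["sports", "sports", "finance"]
test_items = {"n1": "finance", "n2": "sports", "n3": "sports", "n4": "travel"}
print(pop_rank(test_items, training, k=3))
```
]]></preformat>
      <p>The same function covers both category levels, depending on whether level-1 or level-2 categories are passed in. Note that the resulting list is identical for every user, up to the random tie-breaking.</p>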
      <p>For the number of neighbors in our neighborhood-based
transfer learning method, we first fix it as 100, and then
change it to 50 and 150 in order to study its impact. We
denote our transfer learning solution with level-1 category as
NTL-C1 and that with level-2 category as NTL-C2, where
their prediction rules are shown in Eq.(6) and Eq.(7),
respectively. Note that for Random, PopRank-C1, PopRank-C2,
and NTL with randomly selected neighbors, we repeat the
experiments 10 times and report the average results.</p>
    </sec>
    <sec id="sec-8">
      <title>Results</title>
      <p>We report the main results in Table 2, from which we
can make the following observations:</p>
      <list list-type="bullet">
        <list-item>
          <p>The overall performance ordering is PopRank-C1,
Random, PopRank-C2 ≪ NTL-C2 &lt; NTL-C1, which
clearly shows the effectiveness of our proposed transfer
learning solution for the challenging dual cold-start
recommendation problem.</p>
        </list-item>
        <list-item>
          <p>The performance of PopRank-C2 and PopRank-C1 is
rather poor in comparison with our proposed solution.
The reason is that popularity-based methods are
non-personalized and simply select the most
popular level-1 category or level-2 category for all
users, which ignores the differences in users’ news-reading
preferences.</p>
        </list-item>
        <list-item>
          <p>For the relative performance of NTL-C1 and NTL-C2,
we can see that NTL-C1 performs better, as
expected, because the level-1 category may introduce more
smoothing effect for the cold-start problem.</p>
        </list-item>
      </list>
      <p>We further study the impact of the neighborhood size.
The results of our NTL-C1 using 50, 100 and 150 neighbors
are shown in Figure 2. We can see that the results are
relatively stable with different numbers of neighbors, and that
configuring it as 100 usually produces the best performance.</p>
      <p>[Figure 2. x-axis: Neighborhood size (50, 100, 150).]</p>
      <p>In order to gain a deeper understanding of the
transferred neighborhood, we study the performance of
randomly choosing the same number (i.e., 100) of neighbors in our
NTL-C1. We report the results in Figure 3, from which we
can make the following observations:</p>
      <list list-type="bullet">
        <list-item>
          <p>The neighborhood constructed using the app-installation
behaviors is better than its random
counterpart, which shows that the two domains are related
and that knowledge can indeed be transferred from one domain
to the other.</p>
        </list-item>
        <list-item>
          <p>The difference between the two types of neighborhood
is not as large as that between the popularity-based
methods and our NTL in Table 2, which can be explained
by the fact that a large portion of users’ preferences or
tastes in news articles are similar.</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-9">
      <title>CONCLUSIONS AND FUTURE WORK</title>
      <p>In this paper, we study an important and challenging news
recommendation problem called dual cold-start
recommendation (DCSR), which aims to recommend the latest news
articles (cold-start items) to newly registered users (cold-start
users). Specifically, we propose a neighborhood-based
transfer learning (NTL) solution, which addresses the new
user cold-start challenge and the new item cold-start
challenge by the transferred neighborhood from the APP domain
and the category-level preferences in the news domain,
respectively. Empirical results on a real industry dataset show
that our NTL is significantly more accurate than the
few applicable methods, i.e., popularity-based ranking using
category information.</p>
      <p>For future work, we are interested in selecting some
representative genres and categories in the two domains and
building a mapping between them, which will be further used to
study the neighborhood of the items.</p>
    </sec>
    <sec id="sec-10">
      <title>ACKNOWLEDGMENT</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Datar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Garg</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Rajaram</surname>
          </string-name>
          .
          <article-title>Google news personalization: Scalable online collaborative filtering</article-title>
          .
          <source>In Proceedings of the 16th International Conference on World Wide Web, WWW '07</source>
          , pages
          <fpage>271</fpage>
          -
          <lpage>280</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dolan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. R.</given-names>
            <surname>Pedersen</surname>
          </string-name>
          .
          <article-title>Personalized news recommendation based on click behavior</article-title>
          .
          <source>In Proceedings of the 15th International Conference on Intelligent User Interfaces, IUI '10</source>
          , pages
          <fpage>31</fpage>
          -
          <lpage>40</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Pan</surname>
          </string-name>
          and
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          .
          <article-title>A survey on transfer learning</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          ,
          <volume>22</volume>
          (
          <issue>10</issue>
          ):
          <fpage>1345</fpage>
          -
          <lpage>1359</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>W.</given-names>
            <surname>Pan</surname>
          </string-name>
          .
          <article-title>A survey of transfer learning for collaborative recommendation with auxiliary data</article-title>
          .
          <source>Neurocomputing</source>
          ,
          <volume>177</volume>
          :
          <fpage>447</fpage>
          -
          <lpage>453</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>F.</given-names>
            <surname>Ricci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rokach</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Shapira</surname>
          </string-name>
          .
          <source>Recommender Systems Handbook (Second Edition)</source>
          . Springer,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>