<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Combining Models for Better User Satisfaction in Video Recommendation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Josef Florian</string-name>
          <email>josef.florian@firma.seznam.cz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karel Koupil</string-name>
          <email>karel.koupil@firma.seznam.cz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jakub Drdák</string-name>
          <email>jakub.drdak@firma.seznam.cz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Václav Blahut</string-name>
          <email>vaclav.blahut@firma.seznam.cz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michal Řehoř</string-name>
          <email>michal.rehor@firma.seznam.cz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Radek Tomšů</string-name>
          <email>radek.tomsu@firma.seznam.cz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jaroslav Kuchař</string-name>
          <email>jaroslav.kuchar@firma.seznam.cz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Reference Format:</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Josef Florian</institution>
          ,
          <addr-line>Jakub Drdák, Radek Tomšů, Karel Koupil, Václav Blahut, Jaroslav Kuchař, and Michal Řehoř. 2020. Combining Models for Better User Satisfaction in Video Recommendation. In 3rd Workshop on Online Recommender Systems and User Modeling (ORSUM 2020), in conjunction with the 14th ACM Conference on Recommender Systems, September 25th, 2020, Virtual Event</addr-line>
          ,
          <country country="BR">Brazil.</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Seznam.cz</institution>
          ,
          <addr-line>Prague</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>Watch time has been a subject of interest for recommender systems in recent years. Music, podcast and video recommendations based on or amplified by consumption time optimization are often employed to boost perceived user satisfaction, subscriptions and engagement, or to decrease the number of bounces. Finding a fragile balance between several different metrics describing users' behaviour can be challenging, especially in the area of video recommendation. In this paper, we design online algorithms for modelling the relationship between click probability and expected watch time on a video. We explore means of combining and balancing click-based and watch-time-optimizing models in an online multi-criteria setting. We present experiments that involve watch time, CTR and watch ratio. Furthermore, the paper describes empirical evaluations on live traffic and illustrates that our approach has succeeded in outperforming a non-trivial baseline in a controlled manner.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Information systems → Recommender systems; •
Computing methodologies → Machine learning.</p>
      <p>Keywords: multi-criteria optimization, dwell time, video recommendation</p>
    </sec>
    <sec id="sec-2">
      <title>1 INTRODUCTION</title>
      <p>Recommendation systems play a significant role in today's world
of information overload. Users have only a limited amount of time
that they want to spend on consuming online content. Therefore,
media providers strive to deliver the most relevant content. As
Attfield et al. state in [3], user engagement is the emotional,
cognitive and behavioural connection that exists, at any point in
time and possibly over time, between a user and a resource. We
also believe that making use of user engagement has become crucial
for any recommender system. In our use case, users are exposed to
a personalized selection of online media services, including news,
video content, lifestyle, sport, weather forecasts and celebrities.</p>
      <p>In this paper, we address the problem of recommending videos for a
content box, shown in Figure 1. Users rarely provide explicit ratings
or direct feedback when consuming frequently updated online
content. Optimizing an algorithm for CTR (click-through rate,
see Section 4) is straightforward; however, CTR does not capture any
post-click user behaviour and might lead to promoting click-bait
articles and low-quality content. Dwell time, the
time spent on a web page, does capture post-click behaviour and has proven to be a
meaningful and reliable metric of user engagement in the context
of recommendation tasks [11]. Instead of using the dwell time on
the video page as a whole, we decided to use only the time spent
watching the video itself, measured via the video player. Because
we know the duration of the video content, we can also compute the
watch time relative to the length of the video.</p>
      <p>Our main KPIs (Key Performance Indicators) for user
satisfaction are the total watch time that users have
spent on the video service (TTS, total time spent) and CTR. Our objective is to
maximize user engagement while still taking the number of video
views into account.</p>
      <p>In our experiments, we first try to proxy user satisfaction only
via watch time, but this approach substantially harms
CTR. We then look for a balanced trade-off between
CTR and watch time on a video that maximizes users’ TTS while
limiting the harm to CTR. For this purpose, we model the relationship
between CTR and watch time using either a parameterized
multiplication or a parameterized exponential function.</p>
      <p>In this paper, we make the following contributions:
• we suggest two different approaches for modelling the
relationship between click probability and watch time or watch
ratio, respectively (Section 3),
• we propose a solution based on generalized linear models
suitable for an industry setting (Section 4),
• we evaluate our approach on live traffic (Section 4).</p>
    </sec>
    <sec id="sec-3">
      <title>2 RELATED WORK</title>
      <p>Several related papers have already analyzed a fundamental
question: How to measure time users spend consuming content (also
referred to as dwell time or, in our case, watch time) and how it
can be utilized in the context of personalization. Yi et al. in [11]
explore item-level dwell-time as a proxy of relevance of a content
item to a particular user and argue that the amount of time that
users spend on content items is an important metric to measure
user engagement and should be used as a proxy to user
satisfaction for recommended content, complementing and/or replacing
click-based signals.</p>
      <p>Agichtein et al. [2] achieved substantially higher precision
and recall by considering comprehensive UserBehaviour features
that model user interactions after the search and beyond the initial
click, rather than considering click-through alone.</p>
      <p>Kim et al. [6] study the relation between user satisfaction and
click dwell time (i.e. the time a user spends on a clicked result) in
web search. They argue that the previously used fixed
duration thresholds [4] [10], such as 30 seconds, are ineffective for determining user
(dis)satisfaction because of the variety of query-click attributes,
such as query type, page topic, page content and page
readability level. Instead, they model the watch
time distributions of (dis)satisfied clicks w.r.t. different segments of clicks and show
how such features can improve prediction performance.</p>
      <p>We use TTS, which depends on the length of the
videos consumed by users, as one of our KPI metrics. In contrast,
Lagun et al. [7] use metrics that do not depend on the amount
of content an item has, but on the proportion of the item consumed
by users, which makes it easier to compare items with different amounts
of content.</p>
      <p>There are various methods of combining multiple, often
conflicting, objectives when training a model. Rodriguez et al. [9] use
semantic match, a prediction from an existing recommender system
trained on CTR, as the base for the final recommendation, which
is then boosted by a linear combination with indicating coefficients
of additional binary relevance features. Finding optimal values for
the coefficients is then an optimization problem, which for a small
number of additional features is shown to be solvable via grid search.
More recently, Zhao et al. [12] focused on recommending what
video to watch next on a large-scale online video-sharing platform,
similar to the setup of this paper. Their multimodal objectives are
split into two groups, engagement (such as clicks or watches) and
satisfaction (likes, ratings). Each of those objectives is represented
as one of the Multi-gate Mixture-of-Experts [8] outputs, which are later
combined using weighted multiplication, with weights manually
tuned for a desired trade-off between the objectives.
Diversity, novelty and aspects of fairness have also been considered as
features in multi-objective optimization [1] [5].
</p>
    </sec>
    <sec id="sec-4">
      <title>3 ADDRESSING MULTI-CRITERIA OPTIMIZATION</title>
      <p>To maximize TTS, we need to strike a fragile balance between videos
with high CTR and videos with the potential to gain high watch
time. If one naively tries to increase TTS, for instance by
recommending very long videos without considering CTR, the videos may
not be watched and the final TTS may suffer. On the other hand,
less relevant content might acquire high CTR due to, e.g., a gratuitously
catchy title, which results in early leaves and thus also in a loss of TTS.</p>
      <p>We consider several methods that consist of single models that
are combined and compared later to compound models. The single
models are based on generalized linear models (GLM). Specifically,
we experimented with the following models:</p>
      <p>Log: logistic regression (click probability)</p>
      <p>Lin: linear regression (watch ratio and watch time)</p>
      <p>Poiss: Poisson regression (number of watched parts)</p>
      <p>The Log model is a GLM with a log-odds link function. The model
considers clicks that resulted in played videos as positive examples.
All other recommended videos are considered negative examples.
The Lin models are GLMs with identity link functions. In the former
case (watch ratio), the training samples have labels representing the ratios of
consumed times for target videos w.r.t. their durations. In the latter
case (watch time), the labels represent the total watched time in seconds for the
target videos. The Poiss model is a GLM with a natural logarithm
link function. The labels are the numbers of watch parts sent by the
video player every 10 seconds while a video is being played.</p>
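      <p>As a minimal sketch of how the labels described above could be derived from one logged impression, consider the following Python snippet. The record fields (clicked_and_played, watched_seconds, duration_seconds) and the choice to count only completed 10-second parts are assumptions made for illustration; they are not taken from our production logs.</p>
      <preformat>
<![CDATA[
```python
from dataclasses import dataclass

WATCH_PART_SECONDS = 10  # the player reports one watch part every 10 seconds


@dataclass
class Impression:
    """One recommended video shown to a user (hypothetical log record)."""
    clicked_and_played: bool
    watched_seconds: float
    duration_seconds: float  # assumed to be strictly positive


def make_labels(imp):
    """Build illustrative labels for the Log, Lin and Poiss models."""
    return {
        # Log: a click that led to a played video is a positive example.
        "log_click": 1 if imp.clicked_and_played else 0,
        # Lin (watch ratio): consumed time relative to the video duration.
        "lin_watch_ratio": imp.watched_seconds / imp.duration_seconds,
        # Lin (watch time): total watched time in seconds.
        "lin_watch_time": imp.watched_seconds,
        # Poiss: number of completed 10-second watch parts (assumed rounding).
        "poiss_watch_parts": int(imp.watched_seconds // WATCH_PART_SECONDS),
    }


labels = make_labels(Impression(True, 95.0, 380.0))
```
]]>
      </preformat>
      <p>In this example, 95 seconds watched out of 380 gives a watch ratio of 0.25 and nine completed watch parts.</p>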
      <p>
        We try to use certain combinations of these single models and
optimize them both on the click probability and on the watch time
spent on a particular video. Let us assume user u ∈ U and video
v ∈ V. Having the click probability c and the estimation of watch
time t, we examine the following ways of combining c and t to
obtain the resulting score:
        <disp-formula>
          <label>(1)</label>
          <tex-math>s_1(c, t, \alpha) = c^{\alpha} \cdot t^{1-\alpha}</tex-math>
        </disp-formula>
        <disp-formula>
          <label>(2)</label>
          <tex-math>s_2(c, t, \beta) = t^{(c^{\beta})}</tex-math>
        </disp-formula>
      </p>
      <p>
        Variable c represents a probability, thus its values are bounded
between 0 and 1. Variable t represents the watch time on a video
and its value can be any non-negative real number. Parameter α
regulates the trade-off between c and t. It makes sense to restrict α
to values between 0 and 1. Similarly, β regulates the importance of c and t
in equation (2). We restrict β to be any non-negative real number.
      </p>
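      <p>The two combination functions can be sketched in a few lines of Python. This is only an illustration: it assumes that equation (2) denotes t raised to the power c^β, and the function names are hypothetical.</p>
      <preformat>
<![CDATA[
```python
def score_multiplication(c, t, alpha):
    """Equation (1): c**alpha * t**(1 - alpha), with alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is expected to lie in [0, 1]")
    return (c ** alpha) * (t ** (1.0 - alpha))


def score_exponential(c, t, beta):
    """Equation (2), read here as t**(c**beta), with beta non-negative."""
    if beta < 0.0:
        raise ValueError("beta is expected to be non-negative")
    return t ** (c ** beta)


# alpha = 1 ranks purely by click probability, alpha = 0 purely by watch time.
click_only = score_multiplication(0.25, 120.0, 1.0)
time_only = score_multiplication(0.25, 120.0, 0.0)
# beta = 0 makes c**beta equal to 1, so the score reduces to the watch time.
beta_zero = score_exponential(0.25, 120.0, 0.0)
```
]]>
      </preformat>
      <p>The boundary values show how the parameters act as a dial between the click model and the watch time model.</p>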
      <p>
        We can assume the following function domains:
        <inline-formula><tex-math>D_{s_1} = [0, 1] \times [0, +\infty) \times (0, 1)</tex-math></inline-formula>
        and
        <inline-formula><tex-math>D_{s_2} = [0, 1] \times [0, +\infty) \times (0, +\infty)</tex-math></inline-formula>.
        It can easily be
seen that both functions (1) and (2) are increasing in both c and t
on the given domain interiors. While function (1) grows in c like a power
function with exponent α, function (2) grows exponentially for fixed t and thus
is more sensitive to subtle differences in CTR (see Figure 2).
      </p>
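      <p>The monotonicity statement can be checked from the partial derivatives. The following worked step is a sketch that assumes function (2) is t raised to the power c^β and that c and t are strictly positive:</p>
      <disp-formula>
        <tex-math>\frac{\partial s_1}{\partial c} = \alpha\, c^{\alpha-1}\, t^{1-\alpha}, \qquad \frac{\partial s_1}{\partial t} = (1-\alpha)\, c^{\alpha}\, t^{-\alpha}</tex-math>
      </disp-formula>
      <disp-formula>
        <tex-math>\frac{\partial s_2}{\partial c} = \beta\, c^{\beta-1}\, t^{c^{\beta}} \ln t, \qquad \frac{\partial s_2}{\partial t} = c^{\beta}\, t^{c^{\beta}-1}</tex-math>
      </disp-formula>
      <p>Both derivatives of function (1) are positive for α strictly between 0 and 1, and the derivative of function (2) with respect to t is positive for positive β. The sign of the derivative of function (2) with respect to c follows the sign of ln t, so under this reading function (2) increases with c only when t exceeds 1, which is the case for predicted watch times longer than one second when t is measured in seconds.</p>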
      <p>Having scores for the particular user and all recommendation
candidates, we sort these videos by the score in descending order.
Top N videos are then recommended to the user.
</p>
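      <p>The ranking step can be illustrated with a short Python sketch. The candidate videos, their click probabilities and predicted watch times are made-up values, and the helper name top_n is hypothetical; the score used is the multiplication from equation (1) with α = 0.8.</p>
      <preformat>
<![CDATA[
```python
def top_n(scores, n):
    """Return the n video ids with the highest score, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [video_id for video_id, _ in ranked[:n]]


# Hypothetical candidates: (click probability c, predicted watch time t in seconds).
candidates = {
    "news": (0.30, 60.0),
    "series": (0.10, 1500.0),
    "movie": (0.02, 5400.0),
}
alpha = 0.8
# Score every candidate with equation (1): c**alpha * t**(1 - alpha).
scores = {
    vid: (c ** alpha) * (t ** (1.0 - alpha)) for vid, (c, t) in candidates.items()
}
recommended = top_n(scores, 2)
```
]]>
      </preformat>
      <p>With these illustrative inputs the short news video still ranks first, because α = 0.8 keeps most of the weight on click probability.</p>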
    </sec>
    <sec id="sec-6">
      <title>4 EXPERIMENTS</title>
      <p>Seznam.cz is a Czech technology company specialized in
internet-related services with a multitude of products. More than 95% of
Czech internet users visit a Seznam website every week. Its products
include a web portal, search engine, news service, email, advertising
platform and a map service with interactive panoramas of streets, rural
roads and parks.</p>
      <p>Televize Seznam is a video service available via a web browser,
both mobile &amp; smart TV apps and digital terrestrial, cable &amp; satellite
broadcasting. It holds 110,000+ videos with 9000+ hours of content.
Our paper focuses on Televize Seznam’s recommendation box on
Seznam.cz’s web portal with its 3.5 million unique users daily.</p>
      <p>Our platform has to manage 4000+ requests per second at peak
hours with sub-100-millisecond latency for the Seznam.cz web portal
box. To provide personalized recommendations to our users, we
mainly use the Vowpal Wabbit machine learning system. With the
help of the subwabbit library (https://pypi.org/project/subwabbit/), our application server (based on the Python
Tornado framework) delivers the recommendations via the back end.
Subsequent user events are queued in Kafka. Data are stored in
Couchbase, MariaDB, ElasticSearch and OpenStack Swift.</p>
    </sec>
    <sec id="sec-7">
      <title>4.1 Data for Experiments</title>
      <p>Our data contains detailed information about user features and
videos, recorded at the time of recommendation. Among other things,
videos carry information about the channels they are published
from. They can have several human-generated tags. Each video
has a duration and a time of publication. User features consist of
information about previously watched videos, user sex, age and a
user profile created from the user’s history collected from visits to
other related web sites. The data also contains user interactions:
whether a user clicked, whether a user started playing a video and
for how long the video was played.</p>
      <p>In summary, over the course of each week, the data contains
information about 1.5M active users, 2.5M clicks and 700 recommendable
videos. Our catalogue of videos is very diverse in the sense that it
contains both minute- and hours-long videos, highly attention-attracting
videos, news videos, online TV series and also full movies.</p>
    </sec>
    <sec id="sec-8">
      <title>4.2 Setup and Evaluation</title>
      <p>
        Single models are trained on combinations of video and user
features in order to learn the affinity between the users and the
videos. Compound models combine outputs from single models
using combination functions (1) and (2). From now on, we call a
compound model using function (1) an α-compound model and a
model using function (2) a β-compound model. In order to explore
the relationship between the hyper-parameters and the metrics, we
perform a grid search over a subset of values of α and β.
      </p>
      <p>For the initial training of new models we use data logged in the
last 3 hours. According to our experiments, such new models then
perform with acceptable initial KPIs. After the initial phase we train
the models incrementally online. During the incremental learning
we update the models every 5 minutes, and a
video-user interaction is considered for training only when the user
has not interacted with the video for at least 30 seconds. In this setting
it is possible to train on a user’s updated interaction multiple times,
e.g. if the user starts watching a video, pauses the video for a while
and then continues watching the video.</p>
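      <p>The 30-second rule can be sketched as a simple filter that runs at each incremental training round. The data layout (a mapping from a user-video pair to the timestamp of its latest event) and the function name are assumptions made for illustration, not a description of our pipeline.</p>
      <preformat>
<![CDATA[
```python
IDLE_SECONDS = 30  # an interaction must be quiet this long before it is trained on


def ready_for_training(last_event_time, now, idle_seconds=IDLE_SECONDS):
    """Select (user, video) pairs whose latest event is at least idle_seconds old.

    last_event_time maps a hypothetical (user_id, video_id) key to the Unix
    timestamp of the most recent click or player event for that pair.
    """
    return sorted(
        key for key, ts in last_event_time.items() if now - ts >= idle_seconds
    )


now = 1_000_000
last_event_time = {
    ("u1", "v1"): now - 45,  # quiet for 45 s: eligible
    ("u1", "v2"): now - 10,  # still being watched: skipped in this round
    ("u2", "v1"): now - 30,  # exactly at the threshold: eligible
}
batch = ready_for_training(last_event_time, now)
```
]]>
      </preformat>
      <p>If the user of a skipped pair later stops interacting, the pair becomes eligible in a following round, and a pair that was already used can reappear after new events, which matches the repeated training on updated interactions mentioned above.</p>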
      <p>In order to assess the performance of the models, we perform A/B/n
tests on live traffic and for each variant we additionally run an A/A
test. In the initial phase we use data logged from a control variant
to train all the new models. To mitigate the interference between
the variants, we train the models in the incremental phase only
on data from the respective A/B/n variant. We use several metrics that
should encompass both user engagement and business performance.
We measure the number of video views (Views), total time spent
watching videos (TTS) and averaged video duration (VD). The last
metric that we use is CTR, which is the total number of clicks
divided by the total number of pageviews.</p>
      <p>The logistic regression model Log, which models the probability that a
user clicks on a video, is the baseline model to which all
results are compared. In the first experiment we compare the
performance of single models optimizing time metrics. These models
are trained on positive feedback only; we do not use non-clicks.
The results of the experiment are given in Table 1. All results in the
table (and in the following tables) show relative changes compared
to the values in the first row.</p>
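      <p>The metrics and the relative reporting can be sketched as follows. The counters are invented for illustration and are not results from our experiments; the function names are hypothetical, and VD is assumed here to be averaged over the viewed videos.</p>
      <preformat>
<![CDATA[
```python
def metrics(pageviews, clicks, views, watched_seconds, view_durations):
    """Compute CTR, Views, TTS and VD from raw counters of one A/B variant."""
    return {
        "CTR": clicks / pageviews,  # total clicks divided by total pageviews
        "Views": views,  # number of video views
        "TTS": watched_seconds,  # total time spent watching videos
        "VD": sum(view_durations) / len(view_durations),  # averaged duration
    }


def relative_to(baseline, variant):
    """Express each metric of a variant as a ratio to the baseline value."""
    return {name: variant[name] / baseline[name] for name in baseline}


# Made-up counters for illustration only; they are not results from the paper.
control = metrics(1000, 50, 40, 4000.0, [100.0, 200.0, 300.0])
treatment = metrics(1000, 40, 30, 5000.0, [300.0, 400.0, 500.0])
change = relative_to(control, treatment)
```
]]>
      </preformat>
      <p>A value above 1 means the variant improved on the baseline for that metric, and a value below 1 means it fell short, which is how the tables in this section should be read.</p>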
      <p>[Figure: combination functions plotted against click probability c (0.0 to 1.0). Curve labels: multiplication (1), c^α · t^(1−α), with α = 0.8; exponential function (2), t^(c^β), with β = 0.7. Panel (a): Multiplication (1).]</p>
      <p>The models perform differently regarding the individual metrics.
For example, Lin-TTS recommends videos four times longer than the
baseline CTR model does. This results in the highest TTS and the
lowest Views among the benchmarked models. The Lin-WR × VD model
recommends on average shorter videos than Lin-TTS, but still more
than two times longer than the baseline. The last model in this
experiment, Poiss-TTS, performed well in CTR and Views but scored
the worst in TTS.</p>
      <p>
        The next two experiments display the results of compound models.
All the compound models use the Log model as c in the combination
functions (1) and (2). Since we wanted to improve user
engagement with an emphasis on moderately long and diverse videos,
we decided to use Lin-WR × VD as the model representing t in
the combination equations. This model gave consistent results over
time and did not emphasize overly long videos compared to the rest.
Tables 2 and 3 display the results of a grid search over a
subset of values of the hyper-parameters α and β. In both cases we can
observe a concave trend in the TTS metric.
      </p>
      <table-wrap>
        <label>Table 2</label>
        <caption>
          <p>Results of α-compound models for various levels of α</p>
        </caption>
        <table>
          <thead>
            <tr><th>Model</th><th>CTR</th><th>Views</th><th>TTS</th><th>VD</th></tr>
          </thead>
          <tbody>
            <tr><td>Log (α = 1)</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>
            <tr><td>α = 0.8</td><td>0.97</td><td>0.85</td><td>1.33</td><td>1.55</td></tr>
            <tr><td>α = 0.6</td><td>0.72</td><td>0.56</td><td>1.28</td><td>2.55</td></tr>
            <tr><td>α = 0.4</td><td>0.55</td><td>0.4</td><td>1.15</td><td>3.46</td></tr>
            <tr><td>α = 0.2</td><td>0.44</td><td>0.3</td><td>0.95</td><td>3.88</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>As the experiments were performed during different time periods,
we perform one more experiment that uses the best performing
models from the previous runs to verify our observations. This is
necessary due to shifts in the data: completely different sets of
videos are recommended at different times, user behaviour may
change substantially, etc. The best performing α-compound model
was the one with α = 0.8. Among the compound variants in Table 2, this model has the highest
CTR, Views and TTS. The best β-compound model is harder to select.
From the TTS perspective, the best results are obtained for β = 0.4;
however, the CTR and Views metrics are substantially worse than for
β = 0.7. For this reason we selected the latter model for the final
comparison. The results are given in Table 4.</p>
    </sec>
    <sec id="sec-9">
      <title>4.3 Discussion</title>
      <p>Our goal was to maximize user engagement, which we represented
by a combination of two metrics: TTS and CTR. Since we did not
succeed at maximizing all the metrics at once, we focused on
maximizing TTS while requiring only a minor drop in
performance regarding the number of Views and CTR. We argue
that if users are willing to spend more time with the service, they
are more engaged with it. Apart from that, we also took into
consideration the average length of recommended videos, because our
experiments suggested that the longer the videos were, the smaller
the number of video views was. However, we aimed at increasing
TTS together with maximizing the total number of views.</p>
      <p>
        The first approach, modelling video watch time directly, did not
meet our expectations. It resulted in recommending longer videos
while decreasing all other metrics. In the second step, we
combined the predictions of different single models using two
different parameterized functions (1) and (2), creating the α-compound
model and the β-compound model. As the results in
Tables 2, 3 and 4 suggest, it is important to compare models
trained and evaluated on the same time period. The takeaway is that, eventually, the best
α-compound model and the best β-compound model performed
similarly and were able to consistently and substantially boost TTS
compared to the best performing click model.
      </p>
    </sec>
    <sec id="sec-10">
      <title>5 CONCLUSIONS AND FUTURE WORK</title>
      <p>In this paper, we demonstrated how to approach a desirable trade-off
between CTR and TTS with respect to clicks within a personalized
video recommendation system. We proposed two different methods
for modelling the relationship between click probability and watch
time on a video. In addition, we proposed algorithms based on
generalized linear models suitable for online learning in an industry
setting.</p>
      <p>Models that were optimized only for watch time did not succeed
at increasing TTS. However, combining these models with the
model optimizing CTR resulted in the best-balanced results and
increased TTS by nearly 28%. The desired objective was achieved, as
the combined models provide higher user engagement.</p>
      <p>Ongoing experiments using the above-mentioned combined
models also show promising results for an article recommendation task
at news and lifestyle content services.</p>
      <p>In future work, we are going to experiment with different
models and model combinations, use hyper-parameter optimization
to achieve the best trade-off between TTS and CTR, and compare
our results to different competing models.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Himan</given-names>
            <surname>Abdollahpouri</surname>
          </string-name>
          , Gediminas Adomavicius, Robin Burke, Ido Guy, Dietmar Jannach, Toshihiro Kamishima, Jan Krasnodebski, and
          <string-name>
            <given-names>Luiz</given-names>
            <surname>Pizzato</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Multistakeholder recommendation: Survey and research directions</article-title>
          .
          <source>User Modeling and User-Adapted Interaction</source>
          <volume>30</volume>
          , 1
          (Jan
          <year>2020</year>
          ),
          <fpage>127</fpage>
          -
          <lpage>158</lpage>
          . https://doi.org/10.1007/s11257-019-09256-1
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Eugene</given-names>
            <surname>Agichtein</surname>
          </string-name>
          , Eric Brill, Susan Dumais, and
          <string-name>
            <given-names>Robert</given-names>
            <surname>Ragno</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Learning User Interaction Models for Predicting Web Search Result Preferences</article-title>
          .
          <source>In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          (Seattle, Washington, USA) (
          <source>SIGIR '06)</source>
          .
          <article-title>Association for Computing Machinery</article-title>
          , New York, NY, USA,
          <fpage>3</fpage>
          -
          <lpage>10</lpage>
          . https://doi.org/10.1145/1148170.1148175
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Simon</given-names>
            <surname>Attfield</surname>
          </string-name>
          , Gabriella Kazai, Mounia Lalmas, and
          <string-name>
            <given-names>Benjamin</given-names>
            <surname>Piwowarski</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Towards a science of user engagement (Position Paper)</article-title>
          .
          <source>(01</source>
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Georg</given-names>
            <surname>Buscher</surname>
          </string-name>
          , Ludger van Elst, and
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Dengel</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Segment-Level Display Time as Implicit Feedback: A Comparison to Eye Tracking</article-title>
          .
          <source>In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          (Boston, MA, USA) (
          <source>SIGIR '09)</source>
          .
          <article-title>Association for Computing Machinery</article-title>
          , New York, NY, USA,
          <fpage>67</fpage>
          -
          <lpage>74</lpage>
          . https://doi.org/10.1145/1571941.1571955
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Bingrui</given-names>
            <surname>Geng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lingling</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Licheng</given-names>
            <surname>Jiao</surname>
          </string-name>
          , Maoguo Gong, Qing Cai, and
          <string-name>
            <given-names>Yue</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>NNIA-RS: A multi-objective optimization based recommender system</article-title>
          .
          <source>Physica A: Statistical Mechanics and its Applications</source>
          <volume>424</volume>
          (04
          <year>2015</year>
          ). https://doi.org/10.1016/j.physa.2015.01.007
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Youngho</given-names>
            <surname>Kim</surname>
          </string-name>
          , Ahmed Hassan Awadallah,
          <string-name>
            <given-names>Ryen W.</given-names>
            <surname>White</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Imed</given-names>
            <surname>Zitouni</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Modeling Dwell Time to Predict Click-level Satisfaction</article-title>
          .
          <source>In The 7th Annual International ACM Conference on Web Search and Data Mining (WSDM 2014)</source>
          . ACM. https://www.microsoft.com/en-us/research/publication/modeling-dwell-time-to-predict-click-level-satsifaction/
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Dmitry</given-names>
            <surname>Lagun</surname>
          </string-name>
          and
          <string-name>
            <given-names>Mounia</given-names>
            <surname>Lalmas</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Understanding User Attention and Engagement in Online News Reading</article-title>
          .
          <source>In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining</source>
          (San Francisco, California, USA) (
          <source>WSDM '16)</source>
          .
          <article-title>Association for Computing Machinery</article-title>
          , New York, NY, USA,
          <fpage>113</fpage>
          -
          <lpage>122</lpage>
          . https://doi.org/10.1145/2835776.2835833
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Jiaqi</given-names>
            <surname>Ma</surname>
          </string-name>
          , Zhe Zhao,
          <string-name>
            <given-names>Xinyang</given-names>
            <surname>Yi</surname>
          </string-name>
          , Jilin Chen, Lichan Hong, and
          <string-name>
            <given-names>Ed H.</given-names>
            <surname>Chi</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Modeling Task Relationships in Multi-Task Learning with Multi-Gate Mixture-of-Experts</article-title>
          .
          <source>In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining (London, United Kingdom) (KDD '18)</source>
          .
          <publisher-name>Association for Computing Machinery</publisher-name>
          , New York, NY, USA,
          <fpage>1930</fpage>
          -
          <lpage>1939</lpage>
          . https://doi.org/10.1145/3219819.3220007
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Mario</given-names>
            <surname>Rodriguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Christian</given-names>
            <surname>Posse</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>Ethan</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Multiple Objective Optimization in Recommender Systems</article-title>
          .
          <source>In Proceedings of the Sixth ACM Conference on Recommender Systems</source>
          (Dublin, Ireland) (
          <source>RecSys '12)</source>
          .
          <publisher-name>Association for Computing Machinery</publisher-name>
          , New York, NY, USA,
          <fpage>11</fpage>
          -
          <lpage>18</lpage>
          . https://doi.org/10.1145/2365952.2365961
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Ryen W.</given-names>
            <surname>White</surname>
          </string-name>
          and
          <string-name>
            <given-names>Diane</given-names>
            <surname>Kelly</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>A Study on the Effects of Personalization and Task Information on Implicit Feedback Performance</article-title>
          .
          <source>In 15th Annual ACM CIKM Conference on Information and Knowledge Management (CIKM 2006)</source>
          , November 5-11, 2006, Arlington, Virginia, USA.
          <fpage>297</fpage>
          -
          <lpage>306</lpage>
          . https://www.microsoft.com/en-us/research/publication/study-effects-personalization-task-information-implicit-feedback-performance/
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Xing</given-names>
            <surname>Yi</surname>
          </string-name>
          , Liangjie Hong, Erheng Zhong, Nanthan Nan Liu, and
          <string-name>
            <given-names>Suju</given-names>
            <surname>Rajan</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Beyond Clicks: Dwell Time for Personalization</article-title>
          .
          <source>In Proceedings of the 8th ACM Conference on Recommender Systems</source>
          (Foster City, Silicon Valley, California, USA) (
          <source>RecSys '14)</source>
          .
          <publisher-name>Association for Computing Machinery</publisher-name>
          , New York, NY, USA,
          <fpage>113</fpage>
          -
          <lpage>120</lpage>
          . https://doi.org/10.1145/2645710.2645724
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Zhe</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lichan</given-names>
            <surname>Hong</surname>
          </string-name>
          , Li Wei, Jilin Chen, Aniruddh Nath, Shawn Andrews, Aditee Kumthekar, Maheswaran Sathiamoorthy, Xinyang Yi, and Ed Chi.
          <year>2019</year>
          .
          <article-title>Recommending What Video to Watch next: A Multitask Ranking System</article-title>
          .
          <source>In Proceedings of the 13th ACM Conference on Recommender Systems</source>
          (Copenhagen, Denmark) (
          <source>RecSys '19)</source>
          .
          <publisher-name>Association for Computing Machinery</publisher-name>
          , New York, NY, USA,
          <fpage>43</fpage>
          -
          <lpage>51</lpage>
          . https://doi.org/10.1145/3298689.3346997
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>