<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Emotional Reactions Prediction of News Posts</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Anastasia Giachanou</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Faculty of Informatics, Università della Svizzera italiana</institution>
          ,
          <addr-line>Lugano</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>ISTI-CNR</institution>
          ,
          <addr-line>Pisa</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>PRHLT Research Center, Universitat Politècnica de València</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>28</fpage>
      <lpage>30</lpage>
      <abstract>
        <p>Online news agents post news articles on social media platforms with the aim of attracting more users. Different types of news trigger different emotions in users, who may feel surprised or sad after reading a piece of news. In this paper, we are interested in predicting the amount of emotional reactions that a news post triggers in users. To address the problem, we propose a model that is trained on features extracted from users' early commenting activity. Our results show that users' early activity features are very important and that combining those features with terms can effectively predict the amount of emotional reactions triggered in users by a news post.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Social media platforms such as Facebook and Twitter allow news agents to post news
articles online, where users can read them, comment on them or express their opinion
about them. Some news articles trigger a large amount of emotional reactions
whereas others do not. Predicting the amount of emotional reactions is important
for dealing with information overload. For example, a system
that can predict the amount of emotional reactions triggered by news articles
allows a user to filter the articles she would like to read based not only on the articles’
content but also on the emotions they trigger.</p>
      <p>
        Predicting the amount of triggered emotional reactions is not a trivial problem.
Network properties such as the structure of the platform, or external factors such
as the user’s location, may affect the reactions of users towards a specific news post.
Intuitively, content is one of the most important factors that influence emotional
reactions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] since there are certain terms that convey sentiment and emotion.
      </p>
      <p>
        A problem related to emotional reactions prediction is online content
popularity prediction. Most prior work was based on early-stage measurements, whereas
little effort has been devoted to pre-publication prediction [
        <xref ref-type="bibr" rid="ref3 ref4">4, 3</xref>
        ]. Bandari et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
tackled the task as both regression and classification, and reached the conclusion that
the prediction is feasible without any early activity signals. However, recently Arapakis
et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] extended the work of [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and showed that predicting the popularity of news
articles prior to their publication is not yet a viable task.
      </p>
      <p>
        A closely related work is that of Clos et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] who proposed a unigram
mixture model to create an emotional lexicon which was then used to predict the
probabilities of different emotional reactions. More recently, Giachanou et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] focused
on predicting the amount of emotional reactions triggered in users. However, they only
explored pre-publication features, including content-based similarities and frequencies
of entities.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Emotional Reactions Prediction</title>
      <p>The problem of predicting the amount of emotional reactions to news posts published on a
social network is defined as follows: given a news article post and data about early activity, the
task consists in predicting the amount of emotional reactions that the post will trigger
in users. Note that our aim is to classify a news post with regard to the amount of the
emotional reactions (e.g., love, surprise, joy, sadness, anger) it will trigger in users. We
address the problem as a 3-class task: given a news post, we assign to it one of the labels
low, medium, or high, which refer to the amount of each emotional reaction that the post will
trigger.</p>
      <p>Features. Intuitively, the content of the post is very important for predicting whether a news article will
trigger a high number of a certain emotional reaction. To this end, in our study we start
with terms. Furthermore, we extract features from users’ early commenting activity to
investigate whether there are temporal patterns in commenting activity.</p>
      <p>
        Frequencies. The simplest textual features are the terms that the news post contains.
Although simple, terms are among the most important features for news
articles’ popularity prediction [
        <xref ref-type="bibr" rid="ref1 ref9">1, 9</xref>
        ] as well as similar information retrieval tasks [
        <xref ref-type="bibr" rid="ref2 ref7">2, 7</xref>
        ].
We use the bag-of-words representation to model the terms. Each term in the vector
is weighted using the term frequency-inverse document frequency (TF-IDF) approach,
which reflects how important the term is in a corpus. In the rest of the paper, we use
terms to refer to the TF-IDF representation of the terms.
      </p>
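      <p>As an illustration of the TF-IDF weighting described above, the following is a minimal sketch rather than the implementation used in the experiments. The tokenization (lowercasing and whitespace splitting), the use of raw counts as term frequency, the IDF variant log(N / df), the function name tfidf_vectors and the toy posts are all assumptions made for this example.</p>
      <preformat><![CDATA[
```python
import math
from collections import Counter


def tfidf_vectors(posts):
    """Return one {term: weight} dict per post using a plain TF-IDF scheme.

    Assumptions (not from the paper): lowercase whitespace tokenization,
    raw term counts as TF, and IDF = log(N / df).
    """
    tokenized = [post.lower().split() for post in posts]
    n_posts = len(tokenized)

    # Document frequency: number of posts that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append(
            {term: tf * math.log(n_posts / doc_freq[term])
             for term, tf in counts.items()}
        )
    return vectors


# Hypothetical toy posts, for illustration only.
posts = [
    "storm hits the coast",
    "team wins the final",
    "storm damages the stadium",
]
vectors = tfidf_vectors(posts)
```
]]></preformat>
      <p>In this toy corpus a term that occurs in every post receives weight zero, while a term that occurs in a single post receives the highest weight, which is the behaviour the weighting is meant to capture.</p>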
      <p>Commenting Activity. Once a news post is published on a social network, users
can publish comments about the post. These comments, which
are published below the news post, are very important because they are an early indicator
of users’ interest in and reaction to the news post. We use the activity of users
in publishing comments on the news post to extract our early activity features
for three time range scenarios: 10, 20, and 30 minutes after the publication of the
news article. We use the following features:</p>
      <list list-type="order">
        <list-item>
          <p>First comment: time difference in seconds between the publication date of the news
post and the first comment, if the first comment is published within the specified
time range.</p>
        </list-item>
        <list-item>
          <p>Number of comments: number of comments published within the specified time
range.</p>
        </list-item>
        <list-item>
          <p>Commenting ratio: mean time of commenting for the comments published within the
specified time range.</p>
        </list-item>
      </list>
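      <p>The three early activity features can be sketched as follows. This is a hedged illustration, not the extraction code used in the experiments: the function name early_activity_features, the use of Unix timestamps, the inclusion of comments posted exactly at the window boundary, the use of None for missing values, and the reading of the commenting ratio as the mean delay (in seconds) of the in-window comments are all assumptions.</p>
      <preformat><![CDATA[
```python
from statistics import mean


def early_activity_features(published_at, comment_times, window_minutes):
    """Compute the three early commenting features for one news post.

    published_at and comment_times are Unix timestamps in seconds.
    Assumptions (not from the paper): comments posted exactly at the window
    boundary are included, missing values are reported as None, and the
    commenting ratio is read as the mean delay of the in-window comments.
    """
    window_seconds = window_minutes * 60
    # Delay of every comment that falls inside the time range.
    delays = sorted(
        t - published_at
        for t in comment_times
        if 0 <= t - published_at <= window_seconds
    )
    return {
        "first_comment": delays[0] if delays else None,
        "num_comments": len(delays),
        "commenting_ratio": mean(delays) if delays else None,
    }


# Hypothetical post published at t = 1000 s with four comments.
published_at = 1000
comment_times = [1030, 1090, 1600, 2500]
features = {
    minutes: early_activity_features(published_at, comment_times, minutes)
    for minutes in (10, 20, 30)
}
```
]]></preformat>
      <p>Computing the features once per time range (10, 20, and 30 minutes) yields one feature set per scenario, so each scenario can be evaluated separately.</p>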
    </sec>
    <sec id="sec-3">
      <title>Experimental Setup</title>
      <p>
        We used the same dataset as in Giachanou et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which contains news posts from The
New York Times group on Facebook together with the amount of 5 different emotional
reactions (love, surprise, joy, sadness, and anger) for each post. The collection consists
of 26,560 news posts that span from April 2016 to September 2017. We used 10-fold
cross-validation to perform the experiments. We kept training and test sets separate.
      </p>
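      <p>A 10-fold split that keeps training and test posts disjoint can be sketched as below. The shuffling with a fixed seed, the absence of stratification by label, and the function name k_fold_indices are assumptions for illustration; the text does not specify how the folds were built.</p>
      <preformat><![CDATA[
```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Assumptions (not from the paper): posts are shuffled once with a fixed
    seed and the folds are not stratified by label.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 26,560 is the collection size reported above.
splits = list(k_fold_indices(26560, k=10))
```
]]></preformat>
      <p>Under this sketch every post appears in exactly one test fold. To keep the sets separate in practice, any statistics learned from data, such as TF-IDF document frequencies, would presumably be computed on the training indices only.</p>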
      <p>
        We performed a 3-class classification task according to which a news post can get
one of the following labels: low, medium, high. We predicted the level of the
following emotional reactions: love, surprise, joy, sadness, and anger, which were
addressed individually. For all the experiments, we used the Random Forest classifier. We
report the F1 score for each emotional reaction. We compare our results with two approaches: terms, which is
based only on the terms of the posts, and All (+terms), which is based on the approach
proposed in Giachanou et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Significance is measured with the McNemar test.
      </p>
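      <p>The evaluation measures can be illustrated with the following sketch, which computes a one-vs-rest F1 score per class and a McNemar test on the predictions of two classifiers. Several points are assumptions, since the text does not specify them: how the per-class F1 scores are averaged (left open here), the use of the exact binomial variant of the McNemar test rather than the chi-square approximation, the function names, and the toy labels and predictions.</p>
      <preformat><![CDATA[
```python
import math


def per_class_f1(y_true, y_pred, labels=("low", "medium", "high")):
    """Return {label: F1} computed one-vs-rest for each class."""
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on the discordant pairs of two classifiers.

    Returns (b, c, p_value), where b counts posts only A classifies correctly
    and c counts posts only B classifies correctly.
    """
    b = sum(a == t and m != t for t, a, m in zip(y_true, pred_a, pred_b))
    c = sum(a != t and m == t for t, a, m in zip(y_true, pred_a, pred_b))
    n = b + c
    if n == 0:
        return b, c, 1.0
    # Under H0 the discordant pairs follow Binomial(n, 0.5).
    tail = sum(math.comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical labels and predictions, for illustration only.
y_true = ["low", "low", "medium", "medium", "high", "high", "high", "low"]
pred_terms = ["low", "medium", "medium", "medium", "high", "low", "high", "low"]
pred_combined = ["low", "low", "medium", "medium", "high", "high", "high", "low"]

f1_terms = per_class_f1(y_true, pred_terms)
b, c, p_value = mcnemar_exact(y_true, pred_terms, pred_combined)
```
]]></preformat>
      <p>The McNemar test only looks at the posts on which the two classifiers disagree, which makes it suitable for comparing two models evaluated on the same test posts.</p>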
    </sec>
    <sec id="sec-4">
      <title>Results and Discussion</title>
      <p>
        From the results in Table 1 we observe that terms are better predictors than
early activity alone. This suggests that for this specific task terms contain more
predictive power than early activity, which is considered the most important
feature for popularity prediction [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. When the early activity features are used alone, the
best performance is obtained for joy. In addition, we observe that for the emotions
surprise and joy the difference between terms and early activity is smaller than for the
rest of the emotions. Indeed, in the case of joy, early<sub>t=30</sub> obtains only a slightly worse
performance than terms. One possible explanation is that users post more comments on news that triggers
joy and surprise than on news that triggers the other emotions.
        <fn><p>The New York Times on Facebook: https://www.facebook.com/nytimes/</p></fn>
      </p>
      <p>Table 1 shows that, in most cases, the performance improves when the time
range is increased. The only exception is the reaction love, for which the performance
slightly decreases. For some emotions (e.g., surprise) the improvement is small, whereas
for other emotions (e.g., anger) the improvement is larger. Nevertheless, we expect that
extracting features from even the first ten minutes is very useful for the prediction while
keeping the advantage of quick access after the post is published. Finally, Table 1 shows
that combining terms with early commenting activity is the most effective approach and
leads to significant improvements over both the terms and the All (+terms) approaches.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Work</title>
      <p>In this study, we presented a methodology for predicting the amount of emotional
reactions that will be triggered by a specific news post. Our results suggest that
early commenting activity is very important for the emotional prediction task. However,
terms contain more predictive power than early activity predictors alone.
More importantly, we showed that models trained on both terms and early commenting
activity can effectively address the problem.</p>
      <p>As future work, we plan to address the task as an ordinal classification or a
regression problem and to predict the exact number of each emotional reaction.</p>
      <p>Acknowledgments. This research was partially funded by the Swiss National Science
Foundation (SNSF) under the project OpiTrack. The work of the second author was partially funded by the Spanish MINECO
under the research project SomEMBED (TIN2015-71147-C2-1-P).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Alam</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Celli</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stepanov</surname>
            ,
            <given-names>E.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riccardi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>The social mood of news: Self-reported annotations to design automatic mood detection systems</article-title>
          .
          <source>In: PEOPLES '16</source>
          . pp.
          <fpage>143</fpage>
          -
          <lpage>152</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Aliannejadi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crestani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Venue suggestion using social-centric scores</article-title>
          . CoRR abs/1803.08354 (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Arapakis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cambazoglu</surname>
            ,
            <given-names>B.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lalmas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>On the feasibility of predicting popular news at cold start</article-title>
          .
          <source>Journal of the Association for Information Science and Technology</source>
          <volume>68</volume>
          (
          <issue>5</issue>
          ),
          <fpage>1149</fpage>
          -
          <lpage>1164</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bandari</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Asur</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huberman</surname>
            ,
            <given-names>B.A.</given-names>
          </string-name>
          :
          <article-title>The pulse of news in social media: Forecasting popularity</article-title>
          .
          <source>In: ICWSM '12</source>
          . pp.
          <fpage>26</fpage>
          -
          <lpage>33</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Clos</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bandhakavi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiratunga</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cabanac</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Predicting emotional reaction in social networks</article-title>
          .
          <source>In: ECIR '17</source>
          . pp.
          <fpage>527</fpage>
          -
          <lpage>533</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Giachanou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mele</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crestani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Emotional influence prediction of news posts</article-title>
          .
          <source>In: ICWSM '18</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Paltoglou</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giachanou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Opinion Retrieval: Searching for Opinions in Social Media</article-title>
          , pp.
          <fpage>193</fpage>
          -
          <lpage>214</lpage>
          . Springer International Publishing (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Shulman</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sharma</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cosley</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Predictability of popularity: Gaps between prediction and understanding</article-title>
          .
          <source>In: ICWSM '16</source>
          . pp.
          <fpage>348</fpage>
          -
          <lpage>357</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Tsagkias</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weerkamp</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Rijke</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Predicting the volume of comments on online news stories</article-title>
          .
          <source>In: CIKM '09</source>
          . pp.
          <fpage>1765</fpage>
          -
          <lpage>1768</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>