<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Bearish-Bullish Sentiment Analysis on Financial Microblogs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Amna Dridi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mattia Atzeni</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Diego Reforgiato Recupero</string-name>
          <email>diego.reforgiatog@unica.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Cagliari, Mathematics and Computer Science Department</institution>
          ,
          <addr-line>Via Ospedale 72, 09124, Cagliari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>User-generated data in blogs and social networks has recently become a valuable resource for sentiment analysis in the financial domain, since it has been shown to be extremely significant to marketing research companies and public opinion organizations. In this paper, a fine-grained approach is proposed to predict a real-valued sentiment score. We use several feature sets consisting of lexical features, semantic features and a combination of lexical and semantic features. To evaluate our approach, a dataset of microblog messages is used. Since our dataset includes confidence scores that are real numbers within the [0-1] range, we compare the performance of two learning methods: Random Forest and SVR. We test the results of the training model boosted by semantics against the classification results obtained with n-grams. Our results indicate that our approach reaches an accuracy level of more than 72% in some cases.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Sentiment analysis in the financial domain is becoming an increasingly important concern
for businesses, organizations and marketing researchers. User-generated texts are of particular
interest due to their high subjectivity, as users freely express their opinions through
opinionated sentences, in contrast to news articles, which are known for their objectivity and
implicit opinions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Both lexicon-based [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ] and machine learning methods [
        <xref ref-type="bibr" rid="ref1 ref4">4, 1</xref>
        ] have been used
for mining users' opinions in the financial domain. Most lexicon-based methods
have focused on coarse-grained analysis of the sentiment expressed in text.
However, coarse-grained methods are insufficient for the detection and polarity
classification of sentiment expressed about companies in financial news text, as not
all expressions of sentiment are related to the company we are interested in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
To tackle this problem, machine learning techniques have recently been
proposed [
        <xref ref-type="bibr" rid="ref1 ref5 ref6">1, 5, 6</xref>
        ] that mainly investigate fine-grained schemes, which allow pinpointing
the particular phrases in a text that express sentiment and analyzing these sentiment
expressions in a fine-grained manner.
      </p>
      <p>
        Both lines of research on sentiment analysis in the financial domain are
still too focused on word-occurrence methods, and they seldom use even
WordNet [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], consequently ignoring advances in semantic techniques.
However, semantics is crucial to the text classification problem. From this
perspective, this work lies at the intersection of NLP, the Semantic Web and sentiment
analysis, which are increasingly being researched for many emerging needs,
such as financial ones. There have been some early-stage efforts to integrate
a semantic abstraction layer in the financial domain [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. However, no previous
studies have focused on investigating the Semantic Web in sentiment analysis in the
financial domain. In this research work, we aim to fill this gap. We believe that
by drawing on common-sense knowledge bases and semantic networks, this study
adds a deep understanding of sentiments and opinions from natural language
expressed by means of user-generated data. By using Framester [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a wide-coverage
hub of linguistic linked data standardized using frame semantics, this
work also adds breadth to the debate on the strengths of using semantics for
sentiment analysis in the financial domain. Additionally, by focusing solely on
user-generated texts, rather than on traditional texts such as newspapers, and
testing our fine-grained sentiment approach on a collection of financially
relevant microblog messages from Twitter<sup>1</sup> and StockTwits<sup>2</sup>, this work enriches the
knowledge base of financial user-generated data. Finally, by training two machine
learning classifiers, boosting the training model with semantics through
replacement and augmentation, and using Apache Spark to deal with user-generated
big data, this research shows that fine-grained polarity detection
in the financial domain with semantic features is slightly better, in terms of
cosine similarity score, than the baseline on the microblog dataset. To
the best of our knowledge, the proposed approach represents the first attempt
at harnessing the Semantic Web in sentiment analysis in the financial domain.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Sentiment analysis in the financial domain has been applied to a wide range
of economic and financial fields [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], such as market prediction [
        <xref ref-type="bibr" rid="ref10 ref11 ref8">8, 10, 11</xref>
        ], box
office prediction for movies [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], analyzing consumers' attitudes towards certain
brands [
        <xref ref-type="bibr" rid="ref2 ref3">3, 2</xref>
        ], and determining financial bloggers' sentiment towards companies
and their stock [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Both lexicon-based [
        <xref ref-type="bibr" rid="ref2 ref3">3, 2</xref>
        ] and machine learning methods [
        <xref ref-type="bibr" rid="ref1 ref4">1, 4</xref>
        ] have been used.
Mostafa [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], for instance, used an expert-predefined lexicon including around
6800 seed adjectives with known orientation to analyze consumer
brand sentiments. He showed that his study added breadth and depth to
the debate over attitudes towards cosmopolitan brands. In the same context,
Ghiassi et al. (2013) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] developed a Twitter-specific lexicon for sentiment
analysis and augmented it with brand-specific terms for brand-related tweets
in order to perform Twitter brand sentiment analysis. They showed that
the reduced lexicon set, while significantly smaller (only 187 features), reduces
modeling complexity, maintains a high degree of coverage over their Twitter
corpus, and yields improved sentiment classification accuracy.
      </p>
      <sec id="sec-2-1">
        <title>1 https://twitter.com/ 2 stocktwits.com/</title>
        <p>
          On the other hand, Ferguson et al. (2009) [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] explored the use of
paragraph-level and document-level annotations, examining how additional
information from paragraph-level annotations can be used to increase the accuracy
of document-level sentiment classification. Similarly, O'Hare et al. (2009) [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]
proposed and evaluated simple text-extraction approaches to extract the most
relevant segments of a document with respect to a given topic. They then
trained and tested sentiment classifiers on the extracted sub-document
representations (word-, sentence-, and paragraph-level text extraction).
        </p>
        <p>
          As far as has been reported, many current research works on
sentiment analysis in the financial domain are still too focused on word-occurrence
methods and rarely use even WordNet [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], consequently ignoring
advances in semantic techniques. However, semantics is crucial to the text
classification problem. In this direction, Khadjeh Nassirtoussi et al. (2015) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]
recently proposed a novel approach to predict intraday directional movements
of a currency pair in the foreign exchange market based on the text of breaking
financial news headlines, using a semantic abstraction layer that addresses the
problem of co-reference in text mining. Their approach performs a selection that
allows words with the same parent word to be regarded as one
entity.
        </p>
        <p>
          The work we present in this paper lies within this context of investigating
semantics for sentiment analysis in the financial domain. Going beyond these works,
and in addition to co-reference resolution, we aim at using wide-coverage
linguistic resources such as FrameNet [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], WordNet [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], BabelNet [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], and others
to leverage semantics for more accurate sentiment analysis, following the novel
sentic computing system presented in [
          <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
          ], which combines natural language
processing techniques with knowledge representation. This leads to a better
exploitation of both computer and human sciences to interpret and process
user-generated data in the financial domain.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Fine-grained sentiment analysis</title>
      <p>The aim of our approach is to take microblog messages as input and predict
the sentiment score of each of the companies or stocks mentioned in the text
instance. Sentiment values need to be floating point values in the range of -1 (very
negative/bearish) to 1 (very positive/bullish), with 0 designating neutral
sentiment. This prediction is realized by assigning a real-valued
score to the overall sentiment in order to provide precise, fine-grained
assessments of sentiment in the financial text. In other words, the role of the machine
learning techniques in our approach is to predict the score given by the
annotator. These methods are supervised and therefore require a training dataset for
their learning stage. For the learning stage, a feature selection task is required.</p>
      <sec id="sec-3-1">
        <title>Feature selection</title>
        <p>For each microblog message, a feature vector is prepared. Our features can be
divided into three main categories: lexical features (n-grams), semantic
features (BabelNet (BN) synsets and semantic frames), and a combination of the lexical and
semantic features.</p>
        <p>Lexical features. In this work, we use word n-grams as lexical features. The
process of n-gram extraction is preceded by a step of text tokenization and
stop-word removal. First, the text of the grouped microblog messages is tokenized
and lemmatized using Stanford CoreNLP. Then, the stop-words are removed
using the Stanford CoreNLP stop-word list<sup>3</sup>. From this standard stop-word list, we
removed the two words "up" and "down", since they are important keywords
in the financial domain that represent sentiment towards stocks and companies.
For instance, our dataset contains many messages like "up almost 11% now". It
is clear here that the word "up" is the keyword that gives important information
about the sentiment of this sentence.</p>
        <p>After tokenization and stop-word removal, we create the lexical feature vector
for each text instance in our dataset. The vector contains (i) the unigrams
resulting from the lemmatization step performed by Stanford CoreNLP, and (ii)
the bigrams and 3-grams produced by the Apache Spark APIs, in particular the
class org.apache.spark.ml.feature.NGram<sup>4</sup>.</p>
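        <p>The lexical pipeline described above can be illustrated with a minimal sketch in plain Python. It is not the CoreNLP and Spark implementation used in this work: the whitespace tokenizer, the tiny stop-word list and the function names are assumptions introduced only for illustration, and lemmatization is omitted.</p>
        <preformat>
```python
# Hypothetical mini stop-word list; the paper uses the Stanford CoreNLP list.
STOP_WORDS = {"a", "the", "is", "of", "now", "up", "down"}
# "up" and "down" carry sentiment in financial text, so they are kept.
FINANCIAL_KEYWORDS = {"up", "down"}
EFFECTIVE_STOP_WORDS = STOP_WORDS - FINANCIAL_KEYWORDS


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for CoreNLP)."""
    return text.lower().split()


def remove_stop_words(tokens):
    """Drop every token that is in the effective stop-word list."""
    return [t for t in tokens if t not in EFFECTIVE_STOP_WORDS]


def ngrams(tokens, n):
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def lexical_features(text):
    """Unigrams, bigrams and 3-grams of the filtered token sequence."""
    tokens = remove_stop_words(tokenize(text))
    return tokens + ngrams(tokens, 2) + ngrams(tokens, 3)


features = lexical_features("up almost 11% now")
print(features)
```
        </preformat>
        <p>With this hypothetical stop-word list, "now" is dropped while "up" survives, so the sentiment-bearing keyword remains available both as a unigram and inside the higher-order n-grams.</p>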
        <p>Semantic features. The semantic features correspond to the semantic frames
and the BabelNet synsets returned by Framester for each microblog message.
Semantic frames and BabelNet synsets have been extracted using the profile b
of the Framester APIs.</p>
        <list list-type="bullet">
          <list-item>
            <p>BabelNet synsets are sets of synonyms in different languages grouped by
BabelNet, an encyclopedic dictionary that provides concepts and named
entities lexicalized in many languages and connected with large amounts of
semantic relations, automatically created by linking Wikipedia<sup>5</sup> to
WordNet [<xref ref-type="bibr" rid="ref14">14</xref>].</p>
          </list-item>
          <list-item>
            <p>Semantic frames are a collection of facts that specify "characteristic
features, attributes, and functions of a denotatum, and its characteristic
interactions with things necessarily or typically associated with it" [<xref ref-type="bibr" rid="ref17">17</xref>].</p>
          </list-item>
        </list>
        <p>We use a semantic replacement method to incorporate semantic features
into the classifier. Semantic replacement means replacing lexical features
(n-grams) by semantic features (BN synsets and/or semantic frames). In other
words, instead of using the textual representation of a microblog message, we
substitute it with BN synsets, semantic frames, or both (BN synsets+semantic
frames).</p>
        <sec id="sec-3-1-1">
          <p>3 https://github.com/stanfordnlp/CoreNLP/blob/master/data/edu/stanford/nlp/patterns/surface/stopwords.txt</p>
          <p>4 https://spark.apache.org/docs/1.5.1/api/java/org/apache/spark/ml/feature/NGram.html</p>
          <p>5 http://www.wikipedia.org/</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Combination of lexical and semantic features</title>
        <p>This consists in augmenting the original n-gram feature space (lexical features)
with the semantic features (BN synsets and semantic frames) as additional features
for classifier training, in three different ways: (i) augmenting the original lexical
features (n-grams) with semantic frames, (ii) augmenting lexical features with BabelNet
synsets, and (iii) augmenting lexical features with both semantic frames and BabelNet synsets.</p>
        <p>The size of the vocabulary in this case is enlarged by the introduced semantic
features. In other words, we use a semantic augmentation method to
incorporate semantic features into the classifier. This means that, instead of using only the
textual representation of the microblog message, we augment it with BN synsets
and/or semantic frames.</p>
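        <p>The difference between semantic replacement and semantic augmentation can be sketched as two operations on feature lists. This is only an illustrative sketch: the function names are ours, and the synset and frame identifiers are hypothetical placeholders rather than actual Framester output.</p>
        <preformat>
```python
def semantic_replacement(ngrams, synsets, frames):
    """Replace the lexical features by the semantic ones.

    The n-grams are deliberately discarded; pass an empty list for
    synsets or frames to obtain the single-resource variants.
    """
    return list(synsets) + list(frames)


def semantic_augmentation(ngrams, synsets, frames):
    """Keep the lexical features and append the semantic ones."""
    return list(ngrams) + list(synsets) + list(frames)


# Hypothetical placeholder identifiers, not real Framester output.
ngrams = ["up", "11%", "up 11%"]
synsets = ["bn:EXAMPLE_SYNSET"]
frames = ["frame:EXAMPLE_FRAME"]

replaced = semantic_replacement(ngrams, synsets, frames)
augmented = semantic_augmentation(ngrams, synsets, frames)
print(replaced)
print(augmented)
```
        </preformat>
        <p>Together with the pure n-gram baseline, the three replacement variants and the three augmentation variants give the seven feature representations compared in the evaluation.</p>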
      </sec>
      <sec id="sec-3-3">
        <title>Sentiment score granularity</title>
        <p>We propose to use SVM regression to compute the quantitative sentiment score
by performing sentiment analysis on a real-valued scale. To do so, it is first
crucial to realize that the features extracted above have different levels of impact
in terms of the sentiment that they entail. Therefore, we propose to represent
features in a scaled manner using TF-IDF. Then, it is important to determine the
positively and negatively correlated words, because the algorithms we use
learn to predict the score of a text instance from microblogs based solely on the
presence or absence of words in the text instance.</p>
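        <p>As an illustration of the scaling step, the following sketch computes TF-IDF weights by hand. The exact TF-IDF variant used in this work is not specified here, so the sketch assumes the plain form tf(t, d) * log(N / df(t)); the function name and the toy documents are hypothetical.</p>
        <preformat>
```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists.

    Returns one {term: weight} dict per document, using the plain
    variant tf * log(N / df) (an assumption, see the text above).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weighted = []
    for tokens in documents:
        counts = Counter(tokens)
        weighted.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted


docs = [["up", "11%"], ["down", "11%"]]
weights = tf_idf(docs)
print(weights)
```
        </preformat>
        <p>In this toy example the term "11%" occurs in every document and therefore receives weight 0, while "up" and "down" keep a positive weight.</p>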
        <p>
          Word-score correlation metric. In order to determine the positively and
negatively correlated words, we use the word-score correlation metric presented
in [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. Note that a "word" here may be a unigram, a bigram or a 3-gram.
        </p>
        <p>The correlation of a word w with the scores of a set of financial messages M,
denoted c(w, M), is defined as follows:
<disp-formula id="eq1"><tex-math>c(w, M) = \frac{1}{|M|} \sum_{m \in M} \left( I(w, m) \left( S(m) - \frac{1}{|M|} \sum_{m' \in M} S(m') \right) \right) \qquad (1)</tex-math></disp-formula>
where S(m) is the real-valued score associated with message m and I(w, m) is
a function that outputs 1 if m contains word w and -1 otherwise. Note
that the term <inline-formula><tex-math>\frac{1}{|M|} \sum_{m' \in M} S(m')</tex-math></inline-formula> is the average message score. Intuitively, if a
word is positively correlated with message scores, then it tends to appear in
messages with above-average scores and to be absent from messages with below-average
scores. Similarly, if a word is negatively correlated with message scores,
then it tends to appear in messages with below-average scores and to be
absent from messages with above-average scores.</p>
        <p>To see how this applies to the correlation metric defined above, notice that
if a word w appears in a message m and m's score is above average, then both
I(w, m) and the deviation of S(m) from the average message score are positive, and the correlation goes
up. If w does not appear in message m and m's score is below average, then both
terms are negative, and again the correlation goes up. Meanwhile, in the other
two cases (when w is not in m and m's score is above average, and when w is
in m and m's score is below average), the terms have different signs and the
correlation drops.</p>
        <p>This metric reveals how much a word's presence or absence tends to cause a
message's score to deviate from the mean, on average. A large positive value
indicates that the word tends to occur in messages with above-average scores
and to be absent from messages with below-average scores, while a large negative
value indicates the opposite. A value near 0 indicates that the word's presence
does not tend to influence the score significantly in either a positive or negative
direction. This metric implicitly tends to remove words that occur too rarely or
too frequently to be useful for learning.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Evaluation</title>
      <sec id="sec-4-1">
        <title>Financial data description</title>
        <p>The microblog messages dataset consists of a collection of financially relevant
microblog messages from Twitter and StockTwits which have been annotated
for fine-grained sentiment analysis. The dataset identifies bullish (optimistic;
believing that the stock price will increase) and bearish (pessimistic; believing
that the stock price will decline) sentiment associated with companies and stocks
in a fine-grained manner. The total number of microblog messages is 1694, with
1086 positive, 581 negative and 27 neutral.</p>
        <p>Each message in the dataset is annotated with the following information:
source, the platform where the message was posted (either Twitter or StockTwits);
id, the unique Twitter or StockTwits ID of the message;
cashtag, the stock ticker symbol that the sentiment
and spans relate to (for example, "$amzn" is a cashtag related to Amazon);
sentiment, a floating point value between -1 (very bearish/negative) and
1 (very bullish/positive) denoting the sentiment expressed towards the cashtag, with 0
denoting neutral sentiment; and spans, a list of strings from the message
which express sentiment.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Evaluation methodology</title>
        <p>We have carried out the experiments twice: the first time using the whole text
of the message, and the second time using only the spans related to the message,
which are defined as a list of strings from the message that express sentiment.</p>
        <p>We have considered the different feature representations for each microblog
message outlined in Section 3.1. We take the first feature
representation, the lexical features (n-grams), as a baseline, and we
compare the accuracy obtained by constructing a classifier trained on each
feature set.</p>
        <p>We have compared two classifiers: a decision-tree-based classifier (Random Forest)
and a Support Vector Regression (SVR) classifier. The Apache Spark machine
learning library MLlib implementation was used for the first classifier, while we
used the Weka machine learning library for the second.</p>
        <p>Ten-fold cross-validation was used for each of the experiments,
with the results averaged over the ten folds. We use cosine similarity as the
performance metric.</p>
        <p>
          As the sentiment scores predicted by the learned classifiers lie on a continuous
scale between -1 and 1, cosine similarity enables comparing the degree of
agreement between gold standard and predicted results. At the same time, it does not
require exact correspondence between the gold and predicted scores: a given
instance does not need to be identical in order to achieve a good evaluation
result. The scores are conceptualized as vectors, where each dimension represents
a stock symbol or company within a given microblog message or headline. Note
that both vectors have the same number of dimensions, as the stock symbols
and companies for which sentiment needs to be assigned are given in the input
data [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ].
        </p>
        <p>Cosine similarity is calculated according to the following equation, where G is
the vector of gold standard scores and P is the vector of scores predicted by the
classifier:
<disp-formula id="eq2"><tex-math>\mathrm{cosine}(G, P) = \frac{\sum_{i=1}^{n} G_i P_i}{\sqrt{\sum_{i=1}^{n} G_i^2} \, \sqrt{\sum_{i=1}^{n} P_i^2}} \qquad (2)</tex-math></disp-formula></p>
        <p>In order to reward classifiers which attempt to answer all problems in the gold
standard, the final score is obtained by weighting the cosine from Equation 2
with the ratio of answered problems (scored instances), given below (as given
in [<xref ref-type="bibr" rid="ref19">19</xref>]):
<disp-formula id="eq3"><tex-math>\mathrm{cosine\_weight} = \frac{|P|}{|G|} \qquad (3)</tex-math></disp-formula></p>
        <p>The final score is the product of the cosine and the weight:
<disp-formula id="eq4"><tex-math>\mathrm{final\_cosine\_score} = \mathrm{cosine\_weight} \times \mathrm{cosine}(G, P) \qquad (4)</tex-math></disp-formula></p>
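        <p>The weighted cosine score can be sketched as follows. This is an illustrative reading of the evaluation measure, not the official scoring script: it assumes that the cosine is computed over the instances for which a prediction exists, that an undefined cosine (zero-norm vector) counts as 0, and that gold and predicted scores are stored in dictionaries keyed by a hypothetical instance identifier.</p>
        <preformat>
```python
import math


def cosine(gold, predicted):
    """Cosine similarity between two equally long score vectors."""
    dot = sum(g * p for g, p in zip(gold, predicted))
    norm_g = math.sqrt(sum(g * g for g in gold))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    if norm_g == 0 or norm_p == 0:
        return 0.0  # assumption: an undefined cosine is treated as 0
    return dot / (norm_g * norm_p)


def final_cosine_score(gold, predicted):
    """gold, predicted: dicts mapping an instance id to a score.

    The weight is the ratio of answered instances, |P| / |G|.
    """
    if not gold:
        return 0.0
    answered = [key for key in gold if key in predicted]
    weight = len(answered) / len(gold)
    g_vec = [gold[key] for key in answered]
    p_vec = [predicted[key] for key in answered]
    return weight * cosine(g_vec, p_vec)


# Toy example: two of three gold instances are answered, both perfectly.
gold = {"m1": 0.5, "m2": -0.5, "m3": 1.0}
predicted = {"m1": 0.5, "m2": -0.5}
score = final_cosine_score(gold, predicted)
print(score)
```
        </preformat>
        <p>In the toy example the answered instances agree perfectly with the gold standard, so the cosine is 1 and the final score is reduced to about 0.667 only because one of the three instances is left unanswered.</p>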
      </sec>
      <sec id="sec-4-3">
        <title>Results</title>
        <p>Fine-grained classification (Random Forest) and regression (SVR) results
using lexical features, semantic features and the combination of lexical and semantic
features for our dataset are shown in Table 1.</p>
        <p>
          Table 1 shows the cosine similarity scores obtained on the microblog messages,
both for the whole message text and for the spans. The obtained
results demonstrate the effectiveness of the spans compared to the whole
message text, as the granularity of the sentiment is more accurate with this list of
strings that capture the sentiment in a microblog message. The effectiveness of spans is
shown by comparing the results of the Microblogs-Text and Microblogs-Spans rows,
where the accuracy on spans outperforms the accuracy on the message text in
each row, for both algorithms and with all 7 feature sets, notably by more than
12% with the SVR algorithm using n-grams. This substantial improvement from
text-level classification to sentence-level (spans) classification underlines the
importance of text extraction techniques in fine-grained sentiment analysis.
Interestingly, the results indicate that it is possible to achieve large improvements
over message-based sentiment classification using quite simple text-extraction
approaches to extract the most relevant segments of the messages. Regarding the
incorporation of semantics, our experimental results show that the integration of
semantic features performs better than simply using lexical features (n-grams) for
SVR, but not for Random Forest, where our baseline (n-grams) keeps the best
performance. This may be explained by the principle of decision tree algorithms:
the rules are composed of words, and since words have meaning, the rules
themselves can be insightful. More than just attempting to assign a label, a set
of decision rules may suggest a pattern of words found in newswire prior to the
rise of a stock price. The downside of rules is that they can be less predictive
if the underlying concept is complex [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Even though the baseline (n-grams) gives the
best results with Random Forest (0.680 on microblog spans), the highest accuracy overall
is obtained when semantic features are introduced (0.726 on microblog spans with
n-grams+BN synsets+semantic frames).
        </p>
        <p>Over all experimental results, semantic integration through enrichment, either
by BN synsets (n-grams+BN synsets) or by both BN synsets and semantic
frames (n-grams+BN synsets+semantic frames), gives better results. For instance,
for Microblogs-Text the best cosine similarity is reached when n-grams are
enriched by BN synsets, rising from 0.663 with n-grams to 0.677 when BN
synsets are incorporated, a gain in accuracy of more than 2%. On the
microblog spans dataset, the best accuracy is obtained when n-grams are enriched
by both BN synsets and semantic frames, rising from 0.712 to 0.726,
again a gain in accuracy of approximately 2%.</p>
        <p>It is noteworthy that the SVR algorithm is the top performer in all
experiments. This suggests that a regression approach is likely to be the best choice
for fine-grained sentiment analysis.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>Sentiment analysis in the financial domain using user-generated data is
challenging. This work addressed this challenge by bringing together
natural language processing, the Semantic Web and fine-grained sentiment
analysis to propose an approach that predicts a real-valued sentiment score for
each of the companies or stocks mentioned in the text of microblog
messages.</p>
      <p>We have considered three main categories of features: lexical features,
semantic features and a combination of the lexical and semantic features. Then,
using these features, we have compared the performance of two learning
methods: one classification-based and one regression-based algorithm. The approach
reached an accuracy level of more than 72% in some cases
when the training model was boosted by semantics through replacement and
augmentation.</p>
      <p>For our dataset, we have performed two types of experiments: one using
the whole text of the microblog messages and the other using only spans.
Interestingly, our results indicate that spans perform significantly better than
the whole text. This indicates that it is possible to achieve large improvements
over message-based sentiment classification using quite simple text-extraction
approaches to extract the most relevant segments of the messages. In our dataset,
these segments, called spans, are already given in the form of a list of strings
expressing sentiment. This direction could be investigated
in future work by developing techniques to extract the most relevant segments for
sentiment classification at different text levels.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work has been supported by Sardinia Regional Government (P.O.R. Sardegna
F.S.E. Operational Programme of the Autonomous Region of Sardinia,
European Social Fund 2014-2020 - Axis IV Human Resources, Objective l.3, Line of
Activity l.3.1.).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>O'Hare</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davy</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bermingham</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferguson</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheridan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smeaton</surname>
            ,
            <given-names>A.F.</given-names>
          </string-name>
          :
          <article-title>Topic-dependent sentiment analysis of financial blogs</article-title>
          .
          <source>In: Proceedings of the 1st International CIKM Workshop on Topic-sentiment Analysis for Mass Opinion. TSA '09</source>
          , New York, NY, USA, ACM (
          <year>2009</year>
          )
          <volume>9</volume>
          –
          <fpage>16</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Ghiassi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Skinner</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zimbra</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Twitter brand sentiment analysis: A hybrid system using n-gram analysis and dynamic artificial neural network</article-title>
          .
          <source>Expert Syst. Appl</source>
          .
          <volume>40</volume>
          (
          <issue>16</issue>
          ) (
          <year>November 2013</year>
          )
          <volume>6266</volume>
          –
          <fpage>6282</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Mostafa</surname>
            ,
            <given-names>M.M.:</given-names>
          </string-name>
          <article-title>More than words: Social networks' text mining for consumer brand sentiments</article-title>
          .
          <source>Expert Syst. Appl</source>
          .
          <volume>40</volume>
          (
          <issue>10</issue>
          ) (
          <year>August 2013</year>
          )
          <volume>4241</volume>
          –
          <fpage>4251</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Paul</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neil</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michael</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adam</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scott</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paraic</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cathal</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alan</surname>
            ,
            <given-names>F.S.</given-names>
          </string-name>
          :
          <article-title>Exploring the use of paragraph-level annotations for sentiment analysis of financial blogs</article-title>
          .
          <source>In: Proceedings of the 1st workshop on opinion mining and sentiment analysis (WOMSA 2009). WOMSA 2009</source>
          (
          <year>2009</year>
          )
          <fpage>42</fpage>
          –
          <lpage>52</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Van de Kauter</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Breesch</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoste</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Fine-grained analysis of explicit and implicit sentiment in financial news articles</article-title>
          .
          <source>Expert Syst. Appl</source>
          .
          <volume>42</volume>
          (
          <issue>11</issue>
          ) (
          <year>July 2015</year>
          )
          <fpage>4999</fpage>
          –
          <lpage>5010</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Raina</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Sentiment analysis in news articles using sentic computing</article-title>
          .
          <source>In: Proceedings of the 2013 IEEE 13th International Conference on Data Mining Workshops. ICDMW '13</source>
          , Washington, DC, USA, IEEE Computer Society (
          <year>2013</year>
          )
          <fpage>959</fpage>
          –
          <lpage>962</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Fellbaum</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , ed.:
          <article-title>WordNet: an electronic lexical database</article-title>
          . MIT Press (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Khadjeh Nassirtoussi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aghabozorgi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ying Wah</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ngo</surname>
            ,
            <given-names>D.C.L.</given-names>
          </string-name>
          :
          <article-title>Text mining of news-headlines for forex market prediction</article-title>
          .
          <source>Expert Syst. Appl</source>
          .
          <volume>42</volume>
          (
          <issue>1</issue>
          ) (
          <year>January 2015</year>
          )
          <fpage>306</fpage>
          –
          <lpage>324</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Gangemi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alam</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Asprino</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Presutti</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Recupero</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          :
          <article-title>Framester: A wide coverage linguistic linked data hub</article-title>
          .
          <source>In: Knowledge Engineering and Knowledge Management - 20th International Conference, EKAW 2016, Bologna, Italy, November 19-23, 2016, Proceedings</source>
          . (
          <year>2016</year>
          )
          <fpage>239</fpage>
          –
          <lpage>254</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Khadjeh Nassirtoussi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aghabozorgi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ying Wah</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ngo</surname>
            ,
            <given-names>D.C.L.</given-names>
          </string-name>
          :
          <article-title>Review: Text mining for market prediction: A systematic review</article-title>
          .
          <source>Expert Syst. Appl</source>
          .
          <volume>41</volume>
          (
          <issue>16</issue>
          ) (
          <year>November 2014</year>
          )
          <fpage>7653</fpage>
          –
          <lpage>7670</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Sprenger</surname>
            ,
            <given-names>T.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tumasjan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sandner</surname>
            ,
            <given-names>P.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Welpe</surname>
            ,
            <given-names>I.M.</given-names>
          </string-name>
          :
          <article-title>Tweets and trades: the information content of stock microblogs</article-title>
          .
          <source>European Financial Management</source>
          <volume>20</volume>
          (
          <issue>5</issue>
          ) (
          <year>2014</year>
          )
          <fpage>926</fpage>
          –
          <lpage>957</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Du</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Box office prediction based on microblog</article-title>
          .
          <source>Expert Syst. Appl</source>
          .
          <volume>41</volume>
          (
          <issue>4</issue>
          ) (
          <year>March 2014</year>
          )
          <fpage>1680</fpage>
          –
          <lpage>1689</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>C.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fillmore</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lowe</surname>
            ,
            <given-names>J.B.</given-names>
          </string-name>
          :
          <article-title>The Berkeley FrameNet project</article-title>
          .
          <source>In: Proceedings of the 17th International Conference on Computational Linguistics - Volume 1. COLING '98</source>
          , Stroudsburg, PA, USA, Association for Computational Linguistics (
          <year>1998</year>
          )
          <fpage>86</fpage>
          –
          <lpage>90</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Navigli</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ponzetto</surname>
            ,
            <given-names>S.P.</given-names>
          </string-name>
          :
          <article-title>BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network</article-title>
          .
          <source>Artif. Intell</source>
          .
          <volume>193</volume>
          (
          <year>December 2012</year>
          )
          <fpage>217</fpage>
          –
          <lpage>250</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Recupero</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Presutti</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Consoli</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gangemi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nuzzolese</surname>
            ,
            <given-names>A.G.</given-names>
          </string-name>
          :
          <article-title>Sentilo: Frame-based sentiment analysis</article-title>
          .
          <source>Cognitive Computation</source>
          <volume>7</volume>
          (
          <issue>2</issue>
          ) (
          <year>2015</year>
          )
          <fpage>211</fpage>
          –
          <lpage>225</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Gangemi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Presutti</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Recupero</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          :
          <article-title>Frame-based detection of opinion holders and topics: A model and a tool</article-title>
          .
          <source>IEEE Comp. Int. Mag</source>
          .
          <volume>9</volume>
          (
          <issue>1</issue>
          ) (
          <year>2014</year>
          )
          <fpage>20</fpage>
          –
          <lpage>30</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Allan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Natural Language Semantics</article-title>
          . Wiley (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Drake</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ringger</surname>
            ,
            <given-names>E.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ventura</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Sentiment regression: Using real-valued scores to summarize overall document sentiment</article-title>
          .
          <source>In: Proceedings of the 2nd IEEE International Conference on Semantic Computing (ICSC 2008), August 4-7, 2008, Santa Clara, California, USA</source>
          . (
          <year>2008</year>
          )
          <fpage>152</fpage>
          –
          <lpage>157</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veale</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shutova</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barnden</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reyes</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>SemEval-2015 task 11: Sentiment analysis of figurative language in Twitter</article-title>
          .
          <source>In: Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)</source>
          , Denver, Colorado, Association for Computational Linguistics (
          <year>June 2015</year>
          )
          <fpage>470</fpage>
          –
          <lpage>478</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>