=Paper=
{{Paper
|id=Vol-1961/paper08
|storemode=property
|title=Sentiment analysis for real-time applications
|pdfUrl=https://ceur-ws.org/Vol-1961/paper08.pdf
|volume=Vol-1961
|authors=Javi Fernández
}}
==Sentiment analysis for real-time applications==
Sentiment Analysis for Real-time Applications∗
Análisis de sentimientos para aplicaciones en tiempo real
Javi Fernández
University of Alicante
javifm@ua.es
Abstract: In this paper we present a supervised hybrid approach for Sentiment Analysis in Real-time Applications. The main goal of this work is to design an approach which employs very few resources but obtains near state-of-the-art results.
Keywords: Sentiment Analysis, Real-time Applications, Lexicon, Machine Learning
Resumen: In this article we present a supervised hybrid approach to sentiment analysis for real-time applications. The main goal of our work is to design an approach that uses very few resources but obtains results close to the state of the art.
Palabras clave: Sentiment analysis, real-time applications, lexicon, machine learning
1 Introduction
Recent years have seen the birth of social networks and the Web 2.0. They have made it easy for people to share aspects of, and opinions about, their everyday lives. This subjective information can be interesting for general users, brands and organisations. However, the vast amount of information (for example, over 500 million messages per day on Twitter¹) makes it difficult for traditional sentiment analysis systems to process this subjective information in real time. The performance of sentiment analysis tools has become increasingly critical.
The main goal of our work is to design a sentiment analysis approach oriented to real-time applications: an approach that balances efficiency and quality. It must employ very few resources, in order to be able to process as many texts as possible. This will also make sentiment analysis more accessible to everybody. In addition, the quality of the approach should be near state-of-the-art results. In the following sections we explain our approach in detail. Section 2 briefly describes the related work in the field and introduces our work. In Section 3 we detail the approach we propose. Finally, Section 4 concludes the paper and outlines future work.
∗ This research work has been partially funded by the University of Alicante, Generalitat Valenciana, Spanish Government, Ministerio de Educación, Cultura y Deporte and Ayudas Fundación BBVA a equipos de investigación científica 2016 through the projects TIN2015-65100-R, TIN2015-65136-C2-2-R, PROMETEOII/2014/001, GRE16-01: Plataforma inteligente para recuperación, análisis y representación de la información generada por usuarios en Internet and Análisis de Sentimientos Aplicado a la Prevención del Suicidio en las Redes Sociales (ASAP).
¹ www.internetlivestats.com/twitter-statistics
2 Related Work
Two main approaches can be followed: machine learning and lexicon-based (Taboada et al., 2011; Medhat, Hassan, and Korashy, 2014; Mohammad, 2015; Ravi and Ravi, 2015). Machine learning approaches treat polarity classification as a text categorisation problem. Texts are usually represented as vectors of features and, depending on the features used, the system can reach better results. If a labelled training set of documents is needed, the approach is defined as supervised learning; if not, it is defined as unsupervised learning. These approaches perform very well in the domain they are trained on, but their performance drops when the same classifier is used in a different domain (Pang and Lee, 2008; Tan et al., 2009). In addition, if the number of features is large, efficiency drops dramatically. Lexicon-based approaches make use of dictionaries of opinionated words and phrases to discern the polarity of a text. In these approaches, each word in the dictionary is assigned a score for each sentiment (e.g. positivity and negativity). To detect the polarity of a text, the scores of its words are combined, and the polarity with the greatest score is chosen. These dictionaries can be generated manually (Tong, 2001), semi-automatically from an initial seed of opinionated words (Kim and Hovy, 2004; Baccianella, Esuli, and Sebastiani, 2010), or automatically from a labelled dataset (Jijkoun, de Rijke, and Weerkamp, 2010; Cruz et al., 2013). The major disadvantage of these approaches is their inability to find opinion words with domain- and context-specific orientations, although the last generation method helps to solve this problem (Medhat, Hassan, and Korashy, 2014). These approaches are usually faster than machine learning ones, as the combination of scores is normally a predefined mathematical function. We decided to use a hybrid approach, trying to take advantage of the categorisation quality of machine learning and the speed of lexicon-based approaches.
Most current sentiment analysis approaches employ words, n-grams and phrases as information units for their models, either as features for machine learning approaches, or as dictionary entries in lexicon-based approaches. However, words and n-grams have problems representing the flexibility and sequentiality of human language. This is the reason why we decided to use skipgrams. Skipgrams are a technique whereby n-grams are formed (bigrams, trigrams, etc.), but in addition to using adjacent sequences of words, some words are allowed to be skipped (Guthrie et al., 2006). In this way, skipgrams are new terms that retain part of the sequentiality of the original terms, but in a more flexible way than n-grams (Fernández et al., 2014). Note that an n-gram can be defined as a 0-skip-n-gram, a skipgram where k = 0. For example, the sentence "I love healthy food" has two word-level trigrams: "I love healthy" and "love healthy food". However, there is one important trigram implied by the sentence that was not captured: "I love food". The use of skipgrams allows the word "healthy" to be skipped, providing the mentioned trigram.
3 Methodology
Our contribution consists in a hybrid approach which creates a lexicon from a labelled dataset and builds a polarity classifier from the dataset and the generated lexicon with machine learning techniques. Its architecture can be seen in Figure 1. In the following subsections we explain the different parts of our approach in detail.
Figure 1: Approach architecture (the Dataset feeds Tokenisation; Tokenisation feeds both Lexicon Generation and Supervised Learning, which produce the Lexicon and the Classifier, respectively)
3.1 Tokenisation
We tried to employ the minimum number of external linguistic tools, to minimise the possible propagation of external errors, in addition to the extra time they can consume. The tokenisation process starts by obtaining all the words in the text. We only extract words containing alphabetic characters. Numbers, punctuation symbols and emoticons are not considered at this moment, but we are studying the best way to include them in the future. The only external resource we employ for the tokenisation process is a stemmer, used to obtain the most general form of the words we extracted. We preferred a stemmer over a lemmatiser because stemmers are much faster (Balakrishnan and Lloyd-Yemoh, 2014) and require fewer resources, one of the goals of our approach. Specifically, we used the Snowball² implementation for each language.
² snowball.tartarus.org
Once we have the words in the text, we combine them using the skipgram modelling to obtain multiword terms. We will use two variables in this work: n will be the maximum number of words when building a new term with the skipgram modelling, and k will be the maximum number of skips. Note that n = 3 includes all the terms with 1, 2 and 3 words, and k = 3 includes 1, 2 and 3 skips.
3.2 Lexicon generation
In summary, our sentiment lexicon consists of a list of terms for each polarity, assigning a score indicating how strongly that term is related to that polarity.
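As an illustration of the skipgram modelling described in Section 3.1, the following is a minimal Python sketch of the term extraction that feeds the lexicon; the helper names `tokenise` and `skipgrams` are ours, and we substitute simple lowercasing for the Snowball stemmer the paper uses:

```python
import re
from itertools import combinations

def tokenise(text):
    # Section 3.1: keep only words made of alphabetic characters;
    # numbers, punctuation and emoticons are discarded. The paper
    # applies a Snowball stemmer here; we simply lowercase instead.
    return [w.lower() for w in re.findall(r"[A-Za-z]+", text)]

def skipgrams(tokens, n, k):
    """Yield (term, skips) pairs: every term of up to n words
    formed with at most k skipped positions in between."""
    for size in range(1, n + 1):
        for idx in combinations(range(len(tokens)), size):
            # number of positions skipped inside the chosen span
            skips = (idx[-1] - idx[0] + 1) - size
            if skips <= k:
                yield " ".join(tokens[i] for i in idx), skips
```

For the example sentence "I love healthy food" with n = 3 and k = 1, this produces the two adjacent trigrams with 0 skips and also captures the implied trigram "i love food" with 1 skip.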
To build this lexicon, we need a polarity-labelled dataset, which will provide both the terms in the lexicon and their scores. There exist many term scoring techniques (Yang and Pedersen, 1997; Chandrashekar and Sahin, 2014), and the majority of them employ probabilities to calculate the scores. However, they do not take full advantage of the skipgram modelling, because they give the same importance to terms whose words were adjacent as to those where some of the words were skipped. Because of this, we created our own scoring formula.
First, we will describe our counting formulas. In general, to count the number of documents a term t occurs in, we would loop over the dataset and add 1 each time we find that term in a document. Instead, we add a value that is inversely proportional to the number of skips. This is what the formulas in Equations 1 and 2 do, where D is the labelled dataset; |D| is the number of documents in D; d is a document in D; Dp is the subset of documents in D labelled with polarity p; |t| is the number of words in term t; and σ(t, d) is the number of skips of term t in document d.

C(t) = \sum_{d \in D} [t \in d] \frac{|t|}{|t| + \sigma(t, d)}    (1)

C(t, p) = \sum_{d \in D_p} [t \in d] \frac{|t|}{|t| + \sigma(t, d)}    (2)

With these counting formulas, the number of skips is taken into account, and we can build our final scoring formula, shown in Equation 3, where s(t, p) is the score of term t for polarity p, and θ is a factor that gives more relevance to terms that appear a larger number of times. This factor depends on the size and the domain of the dataset.

s(t, p) = \frac{C(t, p)}{C(t)} \cdot \frac{C(t, p)}{C(t, p) + \theta}    (3)

At the end of this process we have a list of skipgrams with a score for each polarity: our sentiment lexicon. Table 1 shows an example of a dictionary built using the Movie Reviews dataset (Pang, Lee, and Vaithyanathan, 2002), with n = 2 and k = 10. In this example, we show only the best five terms for each polarity.

Negative        Score
this mess       .871
worst movie     .863
is terrible     .852
ludicrous       .833
waste           .818

Positive        Score
outstanding     .862
is terrific     .826
finest          .823
breathtaking    .803
is excellent    .795

Table 1: Best five terms of the dictionary built using the Movie Reviews dataset.

3.3 Supervised learning
We use machine learning techniques to create a model able to classify the polarity of new texts. The documents in the dataset are employed as training instances, and the labelled polarities are used as categories. However, in contrast with text classification approaches, we do not create one feature per term; we create one feature per polarity. In other words, we have the same number of features as categories. Our hypothesis is that this number of features is enough to obtain a decent system quality with a low latency. The weight of each feature is calculated as specified in Equation 4, where w(d, p) is the weight of the feature for polarity p in document d.

w(d, p) = \sum_{t \in d} s(t, p) \cdot \frac{|t|}{|t| + \sigma(t, d)}    (4)

Table 2 shows an example of feature weighting for the text "worst movie ever", using again the scores of a dictionary built using the Movie Reviews dataset, with n = 2 and k = 10. The final weights (positive = 1.48, negative = 3.40) will be employed as feature weights for the machine learning process.
To build our model we employed Support Vector Machines (SVM), as they have been proved to be effective on text categorisation tasks (Sebastiani, 2002; Mohammad, Kiritchenko, and Zhu, 2013). Specifically, we used the Weka³ (Hall et al., 2009) default implementation with the default parameters (linear kernel, C = 1, ε = 0.1).
³ www.cs.waikato.ac.nz/ml/weka
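Under the definitions above, Equations 1 to 4 can be sketched in Python as follows. This is a minimal illustration with our own naming; in particular, how to count a term that occurs several times in the same document is not specified by the formulas, so keeping its fewest-skips occurrence is our assumption:

```python
from collections import defaultdict

def build_lexicon(dataset, theta=1.0):
    """Equations 1-3. dataset is a list of (terms, polarity) pairs,
    where terms is the list of (skipgram, skips) pairs of one document.
    Each occurrence contributes |t| / (|t| + skips) instead of 1."""
    c_t = defaultdict(float)    # C(t)
    c_tp = defaultdict(float)   # C(t, p)
    for terms, polarity in dataset:
        # [t in d]: count each term once per document; for repeated
        # occurrences we keep the fewest-skips one (our assumption).
        best = {}
        for term, skips in terms:
            n_words = len(term.split())
            best[term] = max(best.get(term, 0.0), n_words / (n_words + skips))
        for term, contrib in best.items():
            c_t[term] += contrib
            c_tp[(term, polarity)] += contrib
    # Equation 3: s(t,p) = C(t,p)/C(t) * C(t,p)/(C(t,p) + theta)
    return {(t, p): (c / c_t[t]) * (c / (c + theta))
            for (t, p), c in c_tp.items()}

def features(terms, lexicon, polarities):
    """Equation 4: one feature per polarity,
    w(d,p) = sum over terms of s(t,p) * |t| / (|t| + skips)."""
    w = dict.fromkeys(polarities, 0.0)
    for term, skips in terms:
        n_words = len(term.split())
        for p in polarities:
            w[p] += lexicon.get((term, p), 0.0) * n_words / (n_words + skips)
    return w
```

The dictionary returned by `build_lexicon` plays the role of the sentiment lexicon of Section 3.2, and `features` turns a tokenised document into the per-polarity feature vector fed to the SVM.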
                Positive        Negative
worst           0.00 · 1.00     0.79 · 1.00
movie           0.48 · 1.00     0.51 · 1.00
ever            0.52 · 1.00     0.45 · 1.00
worst movie     0.00 · 1.00     0.86 · 1.00
worst ever      0.00 · 1.00     0.59 · 0.67
movie ever      0.47 · 1.00     0.37 · 1.00
weight (w)      1.48            3.40

Table 2: Example of feature weights for the sentence "worst movie ever" using the scores of a dictionary built using the Movie Reviews dataset with n = 2 and k = 10.

4 Discussion
In this paper we presented a supervised hybrid approach for Sentiment Analysis in Twitter. We built a sentiment lexicon from a polarity dataset using statistical measures. We employed skipgrams as information units, to enrich the sentiment lexicon with combinations of words that do not appear explicitly in the text. The lexicon created was used in conjunction with machine learning techniques to create a polarity classifier.
Preliminary performance experiments have shown an acceptable speed for use in real-time applications⁴. Processing speeds go from 1,000 documents per second in the worst cases (long texts, large values of n and k) to 10,000 in the best cases (short texts, low values of n and k). These numbers are good enough to work with extensively used platforms like Twitter, where users generate over 500 million tweets per day (almost 6,000 tweets per second)⁵.
Moreover, experiments with different datasets have also obtained promising results (Fernández et al., 2013; Fernández, Gómez, and Martínez-Barco, 2014; Fernández et al., 2014; Gutierrez, Tomas, and Fernandez, 2015; Fernández et al., 2015). Experiments with the Movie Reviews dataset (Pang, Lee, and Vaithyanathan, 2002) obtained an accuracy of 86.7%, with long texts in English and 2-level polarity, and 64.7% with the TASS 2012 dataset (Villena-Román and García-Morera, 2013), for Spanish tweets and 6-level polarity.
As future work, we plan to study new methods to calculate and combine the weights of the skipgrams. We also want to add more features to the machine learning algorithm, while always trying to keep their number small, in order to avoid increasing the latency. In addition, we want to include external resources and tools, such as knowledge from existing sentiment lexicons, but always focused on real-time applications. We will also extend our study to different corpora and domains, to confirm the robustness of the approach.
⁴ Using a MacBook Pro 2.4 GHz i5 with 8 GB RAM
⁵ www.internetlivestats.com/twitter-statistics

References
Baccianella, S., A. Esuli, and F. Sebastiani. 2010. SentiWordNet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining. In LREC, volume 10, pages 2200–2204.
Balakrishnan, V. and E. Lloyd-Yemoh. 2014. Stemming and lemmatization: a comparison of retrieval performances. Lecture Notes on Software Engineering, 2(3):262.
Chandrashekar, G. and F. Sahin. 2014. A survey on feature selection methods. Computers & Electrical Engineering, 40(1):16–28.
Cruz, F. L., J. A. Troyano, F. Enríquez, F. J. Ortega, and C. G. Vallejo. 2013. Long autonomy or long delay? The importance of domain in opinion mining. Expert Systems with Applications, 40(8):3174–3184.
Fernández, J., J. M. Gómez, and P. Martínez-Barco. 2014. A supervised approach for sentiment analysis using skipgrams. In 11th International Workshop on Natural Language Processing and Cognitive Science (NAACL).
Fernández, J., Y. Gutiérrez, J. M. Gómez, and P. Martínez-Barco. 2014. GPLSI: Supervised sentiment analysis in Twitter using skipgrams. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages 294–299.
Fernández, J., Y. Gutiérrez, J. M. Gómez, P. Martínez-Barco, A. Montoyo, and R. Muñoz. 2013. Sentiment analysis of Spanish tweets using a ranking algorithm and skipgrams. In XXIX Congreso de la Sociedad Española de Procesamiento de Lenguaje Natural (SEPLN 2013), pages 133–142.
Fernández, J., Y. Gutiérrez, D. Tomás, J. M. Gómez, and P. Martínez-Barco. 2015. Evaluating a sentiment analysis approach from a business point of view.
Guthrie, D., B. Allison, W. Liu, L. Guthrie, and Y. Wilks. 2006. A closer look at skip-gram modelling. In 5th International Conference on Language Resources and Evaluation (LREC 2006), pages 1–4.
Gutierrez, Y., D. Tomas, and J. Fernandez. 2015. Benefits of using ranking skip-gram techniques for opinion mining approaches. In eChallenges e-2015 Conference, pages 1–10. IEEE.
Hall, M., E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten. 2009. The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter, 11(1):10–18.
Jijkoun, V., M. de Rijke, and W. Weerkamp. 2010. Generating focused topic-specific sentiment lexicons. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 585–594. Association for Computational Linguistics.
Kim, S.-M. and E. Hovy. 2004. Determining the sentiment of opinions. In Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), page 1367.
Medhat, W., A. Hassan, and H. Korashy. 2014. Sentiment analysis algorithms and applications: a survey. Ain Shams Engineering Journal.
Mohammad, S. M. 2015. Sentiment analysis: Detecting valence, emotions, and other affectual states from text. Emotion Measurement, pages 201–238.
Mohammad, S. M., S. Kiritchenko, and X. Zhu. 2013. NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets. In Proceedings of the International Workshop on Semantic Evaluation (SemEval-2013).
Pang, B. and L. Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1–2):1–135.
Pang, B., L. Lee, and S. Vaithyanathan. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Conference on Empirical Methods in Natural Language Processing (EMNLP 2002), pages 79–86.
Ravi, K. and V. Ravi. 2015. A survey on opinion mining and sentiment analysis: tasks, approaches and applications. Knowledge-Based Systems, 89:14–46.
Sebastiani, F. 2002. Machine learning in automated text categorization. ACM Computing Surveys (CSUR), 34(1):1–47.
Taboada, M., J. Brooke, M. Tofiloski, K. Voll, and M. Stede. 2011. Lexicon-based methods for sentiment analysis. Computational Linguistics, 37(2):267–307.
Tan, S., X. Cheng, Y. Wang, and H. Xu. 2009. Adapting Naive Bayes to domain adaptation for sentiment analysis. Advances in Information Retrieval, pages 337–349.
Tong, R. M. 2001. An operational system for detecting and tracking opinions in on-line discussion. In Working Notes of the ACM SIGIR 2001 Workshop on Operational Text Classification, volume 1, page 6.
Villena-Román, J. and J. García-Morera. 2013. TASS 2013 - Workshop on Sentiment Analysis at SEPLN 2013: An overview. In XXIX Congreso de la Sociedad Española de Procesamiento de Lenguaje Natural (SEPLN 2013).
Yang, Y. and J. O. Pedersen. 1997. A comparative study on feature selection in text categorization. In ICML, volume 97, pages 412–420.