A Hybrid Approach based Sentiment Extraction from Medical Contexts
Anupam Mondal (1), Ranjan Satapathy (2), Dipankar Das (1), Sivaji Bandyopadhyay (1)
(1) Computer Science and Engineering, Jadavpur University, India
    anupam@sentic.net, ddas@cse.jdvu.ac.in, sbandyopadhyay@cse.jdvu.ac.in
(2) School of Computer and Information Sciences, University of Hyderabad, India
    kumarsatpathy@gmail.com


Proceedings of the 4th Workshop on Sentiment Analysis where AI meets Psychology (SAAIP 2016), IJCAI 2016, pages 35-40,
New York City, USA, July 10, 2016.

                        Abstract

    In the domain of biomedical Natural Language Processing (Bio-NLP),
    information extraction and context sentiment identification are treated
    as emerging tasks. Several linguistic features such as negation,
    uni-grams, bi-grams and Part-of-Speech (POS) tags have been used to
    extract medical concepts and their sense-based context-level
    information. In the present attempt, a hybrid approach that combines
    linguistic and machine learning approaches is introduced to extract
    contextual sense-based information from a medical corpus. The extraction
    of sentiment-oriented keywords is a crucial part of identifying the
    senses of medical contexts. In our previous work, we developed a medical
    sense-based lexicon known as WordNet of Medical Event (WME). Several
    sentiment lexicons such as SentiWordNet and SenticNet were used to
    represent WME. In contrast, one of our primary motivations here is to
    build a sentiment extraction model based on medical contexts that
    leverages the knowledge of WME using a hybrid approach. The developed
    model has two phases, namely a pre-processing phase and a learning
    phase. The pre-processing phase is responsible for extracting and
    preparing structured data from the raw contexts, whereas the learning
    phase helps to identify the sentiment patterns and evaluate the
    sentiment extraction process. The two-phase hybrid model provides 81%
    accuracy in classifying the sentiment of medical contexts as positive or
    negative by employing NaïveBayes and Sequential Minimal Optimization
    (SMO) supervised classifiers.

1   Introduction
One of the major objectives of Sentiment Analysis is to identify and extract
subjective information from a given text using rule-based or machine
learning approaches [Cambria, 2016]. Domain-specific knowledge, combined
with these approaches, helps to extract contextual sentiment information
from a medical corpus. Owing to the limited involvement of domain experts
and the unavailability of domain-specific structured corpora, the task is
challenging in the Bio-NLP domain. To overcome the scarcity of such
domain-specific knowledge for sentiment analysis, several lexicons have been
developed, such as Medical Event Net (MEN), Medical Fact Net (MFN), Medical
Belief Net (MBN) and WordNet of Medical Event (WME) [Cambria et al., 2010].
These lexicons help to extract the sense of a medical concept as well as
fact- and belief-oriented information. The present paper reports the
development of a medical context based sentiment extraction model. One of
our primary aims is therefore to identify the sense-based concepts in
medical contexts and extract their related sentiment features. To identify
the sense-based medical concepts, we use the current version of the WordNet
of Medical Event knowledge base (WME2.0). WME2.0 contains medical concepts
with their related linguistic and sense-oriented features: POS, gloss of the
concept, semantics, polarity score, affinity score, gravity score and
sense(s). Among these features, we have considered only the sense-based
ones (semantics, polarity score, affinity score and sense) to develop the
present sentiment extraction model [Swaminathan et al., 2010]. On top of the
medical concepts extracted with the WME2.0 lexicon, we have applied
linguistic and machine learning approaches to obtain the final sentiment of
the contexts. The linguistic approach helps to manage negation in the
contexts and to derive new rules for extracting the sense(s) of such
contexts. The POS, uni-gram, bi-gram, affinity score, polarity score and
sense features of the WME2.0 medical concepts help to extract the sentiment
of the medical contexts. The supervised machine learning approach has been
introduced to verify the contextual sentiment extracted by the linguistic
approach. In the process, we have applied NaïveBayes and Sequential Minimal
Optimization (SMO) supervised machine learning classifiers to the derived
linguistic features.
   In this paper, we have combined linguistic and machine learning
approaches into a hybrid model to leverage the sentiment-oriented knowledge
of the domain [Villena-Román et al., 2011]. The proposed hybrid model
follows a two-phase architecture consisting of a pre-processing phase and a
learning phase. In the pre-processing phase, we have focused on the
preparation of structured medical concepts from the raw medical contexts and
the
learning phase helps to extract the sentiment of such contexts and to
evaluate it. The two-phase model outputs the sentiment of a context as
positive or negative. The hybrid learning phase provides 81% accuracy in
extracting the sentiment of medical contexts.
   The remainder of the paper is structured as follows. Section 2 presents
related work, followed by the model design, which describes the
pre-processing and learning phases, in Section 3. Section 4 discusses the
model and the evaluation process we have followed. Finally, Section 5
presents our conclusions and the future scope of the model.

2   Related Work
Sentiment analysis of medical contexts is an active and growing research
field within the Bio-NLP domain [Cambria et al., 2013]. The large number of
unstructured corpora and the limited involvement of domain experts make the
task more challenging. Researchers have therefore focused on developing
medical sentiment lexicons to identify the sentiments of medical concepts,
since medical concepts and their sense-based features help to identify the
sentiment of medical contexts. Linguistic, machine learning and hybrid
approaches have been introduced to build concept- and context-based
sentiment extraction systems. The linguistic approach helps to find negation
words and phrases and to construct knowledge-based rules (with unigram,
bigram and n-gram features) for context-level sentiment extraction [Elkin et
al., 2005; Niu et al., 2005; Szarvas et al., 2008]. Smith and Fellbaum
developed a Medical WordNet (MEN) along with two sub-networks, namely
Medical FactNet (MFN) and Medical BeliefNet (MBN), for the evaluation of
consumer health reports [Smith and Fellbaum, 2004]. MEN was developed on the
formal architecture of the Princeton WordNet [Fellbaum, 1998]. MFN helps
non-expert groups gain a better understanding of basic medical information.
MBN identifies beliefs about medical phenomena. Their primary motivation was
to develop a network for medical information retrieval systems with a
visualization effect. Domain-specific knowledge and the above-mentioned
features are essential for improving the efficiency of a sentiment
extraction system [Shukla et al., 2015]. However, these approaches were not
able to provide adequate accuracy because of the lack of knowledge
contributed by domain experts. To overcome this problem, researchers
introduced supervised machine learning approaches [Smith and Lee, 2012].
Standard NaïveBayes, Multinomial NaïveBayes and Support Vector Machine (SVM)
supervised classifiers were applied with unigram, bigram, Part-of-Speech
(POS) and negation features under the machine learning framework.
Researchers have also used hybrid approaches to improve the accuracy of
medical context based sentiment extraction systems. One of the hybrid
approaches was developed by combining linguistic and machine learning
approaches [Boytcheva et al., 2005; Villena-Román et al., 2011]. Sohn et al.
developed an emotion identification system for suicide notes using a hybrid
approach [Sohn et al., 2012]. The suicide notes were provided by the
challenge organizers of Informatics for Integrating Biology and the Bedside
(I2B2). Machine learning, linguistic rule-based and combined approaches were
applied to the training dataset of suicide notes, and the system achieved a
micro-average F-score of 0.5640 on the training dataset. Birks et al.
applied a combination of RIPPER (Repeated Incremental Pruning to Produce
Error Reduction), a multinomial NaïveBayes classifier and manual pattern
matching rules to identify the emotions of sentences [Birks et al., 2009].
Mondal et al. developed the WordNet of Medical Events (WME) lexicon to
identify medical concepts and their knowledge-based and semantic features
using a hybrid approach [Mondal et al., 2015]. The latest version of WME
(WME2.0) contains the POS, semantics, gloss, affinity score, gravity score,
polarity score and sense features of the concepts [Mondal et al., 2016]. The
WME2.0 sentiment lexicon identifies the senses of the concepts using
SentiWordNet(1), SenticNet(2), Bing Liu's lexicon(3) and Taboada's adjective
list [Mondal et al., 2016; Mondal et al., 2015; Taboada et al., 2011]. In
this paper, we have used the WME2.0 lexicon to identify the concepts and
their features in order to extract the sentiments of medical contexts.

(1) http://sentiwordnet.isti.cnr.it/
(2) http://sentic.net/
(3) https://www.cs.uic.edu/liub/FBS/sentiment-analysis.html

                    Figure 1: Two phase proposed Model

3   Model Design
A knowledge-based sentiment lexicon is crucial for designing a context-based
sentiment extraction system. The medical concepts and their linguistic
features are extracted from the domain-specific sentiment lexicon. To
overcome the problem of limited expert availability, we have built the
WME2.0 lexicon with a hybrid approach. It adds an extra dimension
for improving the accuracy of the extracted medical context sentiment. The
proposed hybrid approach combines linguistic and machine learning
approaches. It consists of two phases, namely a pre-processing phase and a
learning phase. Figure 1 shows the architecture of the proposed approach
(model).

3.1 Pre-processing phase
This phase extracts, from each medical context, the context-related medical
concepts together with their sentiments and knowledge-based information. The
structured form of the concepts is essential for identifying the important
medical concepts in the context.

        Figure 2: Flowchart of Preprocessing Phase

To represent the medical concepts in structured form, the required steps are
data extraction, cleansing and formatting. The research community has
provided various linguistic resources, such as open-source data
pre-processing tools (viz. NLTK, stemmers etc.) [Na et al., 2012]. The
following steps illustrate the basic operations of the pre-processing phase:

Data Extraction: Extracting the medical concepts from a given context is the
primary task of this step. WME2.0 helps to extract the medical concepts and
their linguistic and sense-based features from the context. The non-medical
concepts and their senses are also essential for identifying the sentiment
of the context. The senses of the non-medical concepts have been extracted
using the SentiWordNet and SenticNet lexicons [Cambria et al., 2014; Cambria
et al., 2013; Esuli and Sebastiani, 2006].

Data Cleansing: The data cleansing step is responsible for removing the
stop-words of the context and stemming the concept words. The classification
of medical and non-medical concepts and the identification of negation words
(such as no, not, never etc.) are also handled by the data cleansing step
[Huang and Lowe, 2007].

Data Formatting: Data formatting has been applied to represent the extracted
medical concepts in a structured form [Hussain et al., 2011]. The extracted
structured (vector) concepts are forwarded to the learning phase along with
their features. The concept structure is represented as follows:

<Concept (gastric), POS (noun), Semantics (abdominal
breathing, visceral, intestinal, belly, duodenal, stomachic),
Polarity Score (-0.5), Sense (Negative)>

3.2 Learning phase
Following the pre-processing phase, the hybrid approach is introduced in the
learning phase to build the contextual sentiment extraction system.
Linguistic and machine learning approaches are combined to form the hybrid
approach. The linguistic approach, together with the WME2.0 knowledge base
lexicon, helps to identify hidden rules. These rules are able to extract the
sentiment of a concept and its polarity. The extracted linguistic concept
features (rules) were fed to the supervised machine learning classifiers to
evaluate the accuracy of the model. The linguistic approach also handles the
negation effect of the context and helps to identify the appropriate
sentiment of the context [Huang and Lowe, 2007]. The learning phase proceeds
as follows:

Step 1: Identify the polarity score and sense of each concept (medical and
non-medical) of the context.
Step 2: Handle the negation words (concepts) using the linguistic approach.
Step 3: Calculate the overall polarity of the context:
   Context polarity = \sum_{i=1}^{c} Polarity_i
where c is the number of concepts in the context and Polarity_i is the
polarity score of the i-th concept.
Step 4: Evaluate the context sentiment using the Context polarity score.

4   Discussion and Evaluation
The context-related medical concepts and their semantic features (extracted
polarity, semantics and sense) are required to identify the sentiment of a
medical context [Sarker et al., 2011]. Medical sentiment lexicons based on
statistical and linguistic features have faced difficulties because of the
unstructured nature of the corpus. Researchers have therefore tried to build
intelligent automated sentiment extraction systems in the Bio-NLP domain
[Shukla et al., 2015; Sohn et al., 2012]. Such a system helps to extract
structured knowledge-based information together with the proper sentiment of
the context. WordNet of Medical Event (WME2.0) was introduced to identify
medical concepts and their sense-based features. The WME2.0 lexicon is able
to extract the medical concepts and their POS, semantics, gloss, affinity
score, gravity score, polarity score and sense. On top of the WME2.0
lexicon, the hybrid approach has been applied to extract the context-level
sentiment for the
proposed model. The model is based on two phases, namely a pre-processing
phase and a learning phase. The pre-processing phase covers the concept
extraction (medical and non-medical concepts), concept cleansing (concept
stemming and stop-word removal) and concept formatting <Concept, POS,
semantic, polarity score, sense> steps. The learning phase identifies the
sentiment by applying the linguistic and machine learning approaches to the
data produced by the pre-processing phase. The linguistic features of the
concepts and the knowledge-based WME sentiment resource help to extract the
overall context sentiment and polarity score. The linguistic approach
handles negation and identifies the correct sense of the context. For
example, the medical context “No lung lesion found” is evaluated as having
“positive” sentiment after the negation is handled. The system first
extracts the concepts and their senses as “no (-ve)”, “lung (neutral)”,
“lesion (-ve)” and “found (+ve)” using the WME2.0 resource. The linguistic
negation handling approach is then applied to the extracted senses and
identifies the overall context sense as “positive”. In the learning phase,
the hybrid approach has been introduced to extract the context sentiment and
measure its accuracy. The linguistic approach involves knowledge-based
mapping of medical concepts to the WME2.0 lexicon. Further, the NaïveBayes
and Sequential Minimal Optimization (SMO) support vector based supervised
machine learning approaches have been employed to evaluate the accuracy of
the model. Figure 3 and Figure 4 describe the positive and negative
contexts, respectively, with respect to the sentiment extraction process.

4.1 Evaluation Process
To develop the context-level sentiment extraction system and measure its
accuracy, the data has been collected from an open source resource(4). We
extracted 7042 medical contexts and passed them through the proposed
sentiment extraction system. The system labelled 3265 of the contexts as
positive and 3777 as negative. To evaluate the extracted context sentiment,
the linguistic features (number of negation words, context polarity score
and sense) were fed to the NaïveBayes and support vector based SMO
supervised machine learning classifiers in the WEKA(5) tool. The 7042
contexts were divided into 4900 training instances and 2142 test instances.
The system's accuracy was measured as F-Measure under four test modes: Use
training set, Supplied test set, Cross-validation (10 folds) and Percentage
split (66%). Table 1 shows the F-Measures of these modes for the NaïveBayes
and support vector based SMO supervised classifiers. The linguistic and
machine learning based hybrid approach provides an accuracy score of nearly
81% for the medical context sentiment extraction model (the Supplied test
set row of Table 1).

            Table 1: F-Measure of supervised classifiers

      Test mode                      NaïveBayes     SMO
      Use training set               0.868          0.890
      Supplied test set              0.815          0.815
      Cross-validation (10 folds)    0.864          0.867
      Percentage split (66%)         0.873          0.879

           Figure 3: Positive Sentiment extraction
           Figure 4: Negative Sentiment extraction

(4) http://www.medicinenet.com/
(5) http://weka.wikispaces.com/
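The learning-phase steps of Section 3.2 can be illustrated on the "No lung
lesion found" example with a minimal Python sketch. The paper does not state
the exact negation rule, the polarity scores of these four concepts, or the
decision threshold, so the following are assumptions made only for
illustration: a negation word adds no score of its own and flips the sign of
the next non-neutral concept, the scores are invented to match the senses
quoted in Section 4, and a context polarity of zero or more is labelled
positive. The names Concept, context_polarity and context_sentiment are
hypothetical.

```python
from dataclasses import dataclass

# Negation cues named in the data cleansing step ("no, not, never etc.").
NEGATION_WORDS = {"no", "not", "never"}


@dataclass(frozen=True)
class Concept:
    """One formatted concept, a reduced <Concept, POS, Polarity Score, Sense>."""
    text: str
    pos: str
    polarity: float  # illustrative score, not a value taken from WME2.0

    @property
    def sense(self) -> str:
        # Step 1: the sense follows the sign of the polarity score.
        if self.polarity > 0:
            return "positive"
        if self.polarity < 0:
            return "negative"
        return "neutral"


def context_polarity(concepts):
    """Steps 2-3: handle negation, then sum the concept polarity scores.

    Assumed rule: a negation word contributes nothing itself and inverts
    the polarity of the next non-neutral concept in the context.
    """
    total = 0.0
    negate = False
    for concept in concepts:
        if concept.text.lower() in NEGATION_WORDS:
            negate = True
            continue
        score = concept.polarity
        if negate and score != 0:
            score = -score
            negate = False
        total += score
    return total


def context_sentiment(concepts):
    """Step 4: label the context from its polarity (assumed threshold: 0)."""
    return "positive" if context_polarity(concepts) >= 0 else "negative"


# "No lung lesion found" with made-up scores matching the senses in Section 4.
example = [
    Concept("no", "determiner", -0.25),
    Concept("lung", "noun", 0.0),
    Concept("lesion", "noun", -0.5),
    Concept("found", "verb", 0.25),
]
print(context_polarity(example), context_sentiment(example))
```

With these assumed scores the negated "lesion" contributes +0.5 instead of
-0.5, so the context is labelled positive, whereas the same concepts without
"no" would be labelled negative.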


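The evaluation in Section 4.1 reports F-Measures on a 4900/2142 split of the
7042 contexts. As a rough sketch of what that measure computes, the Python
below derives precision, recall and F-Measure for one class from parallel
label lists. It does not reproduce WEKA, the paper does not say how the
per-class scores were averaged, and the labels used here are toy data rather
than data from the paper; f_measure and split_train_test are hypothetical
helper names.

```python
def f_measure(gold, predicted, positive_label="positive"):
    """Return (precision, recall, F-Measure) for one class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive_label and p == positive_label)
    fp = sum(1 for g, p in pairs if g != positive_label and p == positive_label)
    fn = sum(1 for g, p in pairs if g == positive_label and p != positive_label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-Measure is the harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


def split_train_test(items, n_train):
    """Use the first n_train items for training and the rest for testing."""
    return items[:n_train], items[n_train:]


# Sizes reported in Section 4.1: 7042 contexts, of which 4900 are for training.
train, test = split_train_test(list(range(7042)), 4900)
print(len(train), len(test))

# Toy labels (not data from the paper) to exercise the metric.
gold = ["positive", "positive", "negative", "negative", "positive"]
pred = ["positive", "negative", "negative", "positive", "positive"]
print(f_measure(gold, pred))
```

Splitting by position is the simplest choice for the sketch; whether the
paper's split was ordered or randomized is not stated.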
5   Conclusion and Future scope
Sentiment or opinion analysis is important for extracting contextual
information from medical contexts in the NLP domain. The context sentiment
helps to identify knowledge-based information and supports the proper
utilization of the context. This paper has reported a hybrid approach based
context sentiment extraction model with two phases: pre-processing
(extraction of important medical keywords) and learning (identification of
the respective sentiment). The combined linguistic and machine learning
hybrid approach has been applied on top of the WordNet of Medical Event
(WME2.0) lexicon to extract the medical concepts in order to identify the
sentiment of the medical context. The polarity scores of the medical
concepts and their related senses help to identify the medical context
sentiment [Cambria, 2013; Cambria et al., 2015]. The affinity scores and
semantic features of the medical concepts drawn from the WME2.0 lexicon are
crucial in building the proposed model. The semantics, polarity score and
affinity score of a medical concept help to identify its sentiment together
with a polarity score. The hybrid approach provides nearly 81% accuracy for
the proposed context sentiment extraction system. Future research will focus
on developing practical applications of the current work, such as medical
annotation and context summarization systems. These systems will support
expert and non-expert groups in their respective applications.

References
[Mondal et al., 2016] Anupam Mondal, Dipankar Das, Erik Cambria and Sivaji
   Bandyopadhyay. WME: Sense, polarity and affinity based concept resource
   for medical events. In Proceedings of the Eighth Global WordNet
   Conference, pages 242–246, 2016.
[Birks et al., 2009] Yvonne Birks, Jean McKendree, and Ian Watt. Emotional
   intelligence and perceived stress in healthcare students: a
   multi-institutional, multi-professional survey. BMC Medical Education,
   9(1):1–8, 2009.
[Boytcheva et al., 2005] Svetla Boytcheva, Albena Strupchanska, Elena
   Paskaleva, Dimitar Tcharaktchiev, and Dame Gruev Str. Some aspects of
   negation processing in electronic health records. In Proceedings of
   International Workshop Language and Speech Infrastructure for Information
   Access in the Balkan Countries, pages 1–8, 2005.
[Cambria et al., 2010] E. Cambria, A. Hussain, T. Durrani, C. Havasi, C.
   Eckl, and J. Munro. Sentic computing for patient centered applications.
   In IEEE 10th International Conference on Signal Processing Proceedings,
   pages 1279–1282, Oct 2010.
[Cambria, 2013] Erik Cambria. An introduction to concept-level sentiment
   analysis. In Advances in Soft Computing and Its Applications - 12th
   Mexican International Conference on Artificial Intelligence, MICAI 2013,
   Mexico City, Mexico, November 24-30, 2013, Proceedings, Part II, pages
   478–483, 2013.
[Cambria, 2016] Erik Cambria. Affective computing and sentiment analysis.
   IEEE Intelligent Systems, 31(2):102–107, 2016.
[Cambria et al., 2015] Erik Cambria, Jie Fu, Federica Bisio, and Soujanya
   Poria. AffectiveSpace 2: Enabling affective intuition for concept-level
   sentiment analysis. In Proceedings of the Twenty-Ninth AAAI Conference on
   Artificial Intelligence, January 25-30, 2015, Austin, Texas, USA, pages
   508–514, 2015.
[Cambria et al., 2014] Erik Cambria, Daniel Olsher, and Dheeraj Rajagopal.
   SenticNet 3: A common and common-sense knowledge base for
   cognition-driven sentiment analysis. In AAAI Conference on Artificial
   Intelligence, 2014.
[Cambria et al., 2013] Erik Cambria, Björn Schuller, Yunqing Xia, and
   Catherine Havasi. New avenues in opinion mining and sentiment analysis.
   IEEE Intelligent Systems, 28(2):15–21, 2013.
[Hussain et al., 2011] A. Hussain, E. Cambria, and C. Eckl. Bridging the gap
   between structured and unstructured healthcare data through semantics and
   sentics. In Proceedings of ACM WebSci, Koblenz, 2011.
[Elkin et al., 2005] Peter L. Elkin, Steven H. Brown, Brent A. Bauer, Casey
   S. Husser, William Carruth, Larry R. Bergstrom and Dietlind L.
   Wahner-Roedler. A controlled trial of automated classification of
   negation from clinical notes. BMC Medical Informatics and Decision
   Making, 5(1):1–7, 2005.
[Esuli and Sebastiani, 2006] Andrea Esuli and Fabrizio Sebastiani.
   SentiWordNet: A publicly available lexical resource for opinion mining.
   In Proceedings of the 5th Conference on Language Resources and Evaluation
   (LREC06), pages 417–422, 2006.
[Fellbaum, 1998] Christiane Fellbaum. WordNet: an electronic lexical
   database. MIT Press, 1998.
[Huang and Lowe, 2007] Yang Huang and Henry J. Lowe. A novel hybrid approach
   to automated negation detection in clinical radiology reports. Journal of
   the American Medical Informatics Association: JAMIA, 14(3):304–311, May
   2007.
[Mondal et al., 2015] Anupam Mondal, Iti Chaturvedi, Dipankar Das, Rajiv
   Bajpai, and Sivaji Bandyopadhyay. Lexical resource for medical events: A
   polarity based approach. In IEEE ICDM Workshops, pages 1302–1309. IEEE,
   2015.
[Na et al., 2012] Jin-Cheon Na, Wai Yan Min Kyaing, Christopher SG Khoo,
   Schubert Foo, Yun-Ke Chang, and Yin-Leng Theng. Sentiment classification
   of drug reviews using a rule-based linguistic approach. In The outreach
   of digital libraries: a globalized resource network, pages 189–198.
   Springer, 2012.
[Niu et al., 2005] Yun Niu, Xiaodan Zhu, Jianhua Li, and Graeme Hirst.
   Analysis of polarity information in medical text. In Proceedings of the
   American Medical Informatics Association Annual Symposium, 2005.
[Sarker et al., 2011] Abeed Sarker, Diego Mollá-Aliod, Cécile Paris, et al.
   Outcome polarity identification of medical papers. Melbourne: Australian
   Language Technology Association, 2011.
[Shukla et al., 2015] Ravi Shankar Shukla, Kamendra Singh Yadav, Syed Tarif
   Abbas Rizvi, and Faisal Haseen. An Efficient Mining of Biomedical Data
   from Hypertext Documents via NLP. In Proceedings of the 3rd International
   Conference on Frontiers of Intelligent Computing: Theory and Applications
   (FICTA) 2014: Volume 1, pages 651–658. Springer International Publishing,
   Cham, 2015.
[Smith and Fellbaum, 2004] Barry Smith and Christiane Fellbaum. Medical
   wordnet: A new methodology for the construction and validation of
   information resources for consumer health. In Proceedings of COLING,
   2004.
[Smith and Lee, 2012] Phillip Smith and Mark Lee. Cross-discourse
   development of supervised sentiment analysis in the clinical domain. In
   Proceedings of the 3rd Workshop in Computational Approaches to
   Subjectivity and Sentiment Analysis, WASSA ’12, Association for
   Computational Linguistics, pages 79–83, Stroudsburg, PA, USA, 2012.
[Sohn et al., 2012] Sunghwan Sohn, Manabu Torii, Dingcheng Li, Stephen Wu,
   Hongfang Liu, and Avishwar Wagholikar. A Hybrid Approach to Sentiment
   Sentence Classification in Suicide Notes. In Biomedical Informatics
   Insights, pages 43+, January 2012.
[Swaminathan et al., 2010] Rajesh Swaminathan, Abhishek Sharma, and Hui
   Yang. Opinion mining for biomedical text data: Feature space design and
   feature selection. In The Ninth International Workshop on Data Mining in
   Bioinformatics, BIOKDD, 2010.
[Szarvas et al., 2008] György Szarvas, Veronika Vincze, Richárd Farkas, and
   János Csirik. The bioscope corpus: annotation for negation, uncertainty
   and their scope in biomedical texts. In Proceedings of the Workshop on
   Current Trends in Biomedical Natural Language Processing, Association for
   Computational Linguistics, pages 38–45, Columbus, Ohio, June 2008.
[Taboada et al., 2011] Maite Taboada, Milan Tofiloski, Julian Brooke,
   Kimberly Voll, and Manfred Stede. Lexicon-based methods for sentiment
   analysis. Computational Linguistics, 37(2):267–307, MIT Press, 2011.
[Villena-Román et al., 2011] Julio Villena-Román, Sonia Collada-Pérez, Sara
   Lana-Serrano, and José Carlos González Cristóbal. Hybrid approach
   combining machine learning and a rule-based expert system for text
   categorization. In R. Charles Murray and Philip M. McCarthy, editors,
   FLAIRS Conference. AAAI Press, 2011.