<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Detecting events and sentiment on Twitter for improving Urban Mobility</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>A. Candelieri</string-name>
          <email>candelieri@milanoricerche.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>F. Archetti</string-name>
          <email>archetti@milanoricerche.it</email>
          <email>francesco.archetti@unimib.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Consorzio Milano Ricerche</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, Systems and Communication - University of Milano-Bicocca</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The streams of tweets from and to the Twitter account of urban transport operators have been considered. A computational module has been designed and developed to collect tweets and analyze them on the fly, in order to detect relevant events (e.g. accidents, sudden traffic jams, service interruptions, etc.) and/or evaluate possible sentiments and opinions about the quality of service. Events are recognized through simple word matching, while sentiment analysis is performed via supervised learning (Support Vector Machine). The text mining solutions have been developed to work with the Italian language; however, they could be easily extended to other languages should tweets in those languages become available. This approach has been tested on urban transportation in Milan (Azienda Trasporti Milano, ATM) within the framework of the TAM-TAM project, which has developed a technological platform for improving urban mobility by exploiting the large amount of information shared by the users of transportation services through Twitter. Detected events are used by other software modules of the TAM-TAM platform to support more effective travel planning, while the inferred sentiment may be used by the transport provider to tune the mobility supply to commuters' needs.</p>
      </abstract>
      <kwd-group>
        <kwd>smart urban mobility</kwd>
        <kwd>sentiment analysis</kwd>
        <kwd>crowdsourcing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The relevance of a “narrative-aware design framework” in the design and
implementation of smart urban environments has already been highlighted in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The
combined diffusion of smart mobile devices and social networks has been rapidly
increasing the amount of user-generated content, making crowdsourcing a huge source of
potentially useful, usually unstructured, information to be transformed into actionable
knowledge for service and product innovation as well as for improving urban quality of life.
According to this vision, the Italian project TAM-TAM, co-funded by the Italian
Ministry of Education, University and Research together with Regione Lombardia, has
designed and developed a technological platform able to combine information from
official data sources with the huge amount of unstructured information generated through
crowdsourcing, even on the move, and related to transportation services in the city of
Milan. Citizens, commuters and tourists already rely on social awareness and
collective intelligence to make more personalized and informed mobility decisions, mainly
by reading and sharing short messages on Twitter. The aim of TAM-TAM is to close
the loop on these streams of data and analyse them in order to provide users with
added-value services. The benefits provided by the automatic analysis of tweets have
already been investigated and demonstrated in other domains, such as the automatic detection of
anomalies related to power outage events during hurricane Irene on August 27, 2011
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. More recently, attention has been focusing on terrorism, radicalization and hate speech
[
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. Unlike these applications, where a first analysis is performed to
discriminate between relevant and irrelevant tweets, limiting the collection to tweets
posted from and to the Twitter account of the transportation company makes it possible to
consider all of them as relevant. The authors of this paper have designed the tweets
collection and analysis component of TAM-TAM, which is overall aimed at providing
innovative services through:
        <list list-type="bullet">
          <list-item><p>integration of data and information coming from different sources, both official and
crowd-sourced (e.g., time-tables, on-line positioning data, traffic estimation, etc.);</p></list-item>
          <list-item><p>support for intermodal and personalized transport options;</p></list-item>
          <list-item><p>computational modules for expressive-media content analysis, based on sentiment
and opinion mining techniques [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], for event detection and evaluation of the
perceived quality of transport service;</p></list-item>
          <list-item><p>travel planning software that provides users with information on costs, time,
environmental impact and perceived quality of service with respect to the opinions of
other commuters;</p></list-item>
          <list-item><p>decision support functionalities to identify and address criticalities in the proposed
urban transportation supply, enabling more effective and efficient plans according
to variations in mobility users' preferences.</p></list-item>
        </list>
      </p>
      <p>
        The contribution of this paper consists of the development and validation of a
computational module devoted to collecting tweets, both from and to the Twitter account of
the public transportation company in Milan, Azienda Trasporti Milano (ATM), and
then analysing their content according to two goals: the automatic
identification of events (e.g., accidents, sudden traffic jams, etc.) as posted by the users, and the
automatic detection of opinions about the transport service (e.g., delays, inefficiencies,
perceived security, dirt, etc.). Some preliminary results obtained during the first activities
of the project were reported in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], where an initial design of the
computational module was presented; that design is further specialized in this paper.
      </p>
    </sec>
    <sec id="sec-2">
      <title>TAM-TAM: general architecture</title>
      <p>Figure 1 summarizes the overall architecture of the TAM-TAM platform, with a
major focus on the component devoted to the analysis of tweets. The other relevant
components and services of the platform are: i) the central database used to provide the
different visualization layers related to official – structured – information, such as lines,
time-tables, on-line positioning data, traffic estimation, etc.; ii) web and mobile apps
for login/profiling and visualization.</p>
      <p>In more detail, the main modules of the tweets analyser component, and the way they
operate, are the following:</p>
      <p>Crawler</p>
      <p>Crawler is the module devoted to continuously collecting tweets from (bold line) and to
(dotted line) the Twitter account of the urban transportation company in Milan
(@atm_informa). Crawler also stores the acquired tweets,
according to the data model provided by the Twitter API, in a MySQL database (Tweets
database), which is then used to perform further (off-line) analysis aimed at validating
new machine learning algorithms and mining new models.</p>
      <p>Event Detector</p>
      <p>Event Detector implements a simple word-matching algorithm in order to identify,
within tweets, keywords associated with relevant events. The set of keywords is based on
the set of “standard” words generally used by ATM to inform customers about relevant
events (e.g. strikes, accidents, interruptions, deviations, etc.), but it is completely
customizable. The same set of keywords is also used to detect potential events
communicated by the commuters. Finally, Event Detector searches for other words, and their
synonyms, referring to: type of transportation (i.e. bus, tram or underground), specific line
and, where available, direction. All this information is well defined in lists that
can be retrieved from the ATM web site.</p>
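      <p>As an illustration, the word matching described above might be sketched in Python as follows. This is only a minimal sketch: the keyword dictionaries, the line pattern and the function name detect_event are hypothetical placeholders, not the lists actually published by ATM nor the code deployed in TAM-TAM.</p>
      <preformat><![CDATA[
```python
import re

# Hypothetical keyword lists (Italian -> event type); the real lists come from ATM.
EVENT_KEYWORDS = {
    "sciopero": "strike",
    "incidente": "accident",
    "interruzione": "interruption",
    "deviazione": "deviation",
}
# Hypothetical synonyms for the type of transportation.
TRANSPORT_KEYWORDS = {
    "bus": "bus", "autobus": "bus",
    "tram": "tram",
    "metro": "underground", "metropolitana": "underground",
}
# Assumed pattern for a line identifier, e.g. "linea 90" or "M1".
LINE_PATTERN = re.compile(r"\b(?:linea\s+(\d+)|(m[1-5]))\b")


def detect_event(text):
    """Return a dict describing the first matched event, or None if no keyword matches."""
    lowered = text.lower()
    tokens = re.findall(r"\w+", lowered)
    event = next((EVENT_KEYWORDS[t] for t in tokens if t in EVENT_KEYWORDS), None)
    if event is None:
        return None
    transport = next(
        (TRANSPORT_KEYWORDS[t] for t in tokens if t in TRANSPORT_KEYWORDS), None
    )
    match = LINE_PATTERN.search(lowered)
    line = (match.group(1) or match.group(2)).upper() if match else None
    return {"event": event, "transport": transport, "line": line}
```
]]></preformat>
      <p>Keeping the keyword lists in plain dictionaries reflects the requirement that the set of keywords be completely customizable; direction matching is omitted here for brevity.</p>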
      <p>While the events detected in tweets from @atm_informa are certain, the events
detected in commuter-generated tweets have to be validated; this is done by
considering the rate of tweets related to the same event in the last 15 minutes. The higher
the rate, the higher the trustworthiness of the event; when the rate of an event drops to
0, the event is no longer valid. This is very important because commuters generally
report and share information about events but not about the return to
normality, while the transportation supplier communicates disruptive events as well as their
resolution.</p>
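      <p>The rate-based validation can be illustrated with a sliding-window counter. The sketch below is an assumption about how such a counter could be written, not the platform's implementation; the class name EventRate and its methods are hypothetical, and reports are assumed to arrive in chronological order.</p>
      <preformat><![CDATA[
```python
from collections import deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)  # window length stated in the text


class EventRate:
    """Track commuter reports of one event over a sliding time window."""

    def __init__(self):
        self._timestamps = deque()

    def report(self, when):
        """Record one tweet about the event, posted at time `when`."""
        self._timestamps.append(when)

    def rate(self, now):
        """Number of reports in the last 15 minutes (older reports are dropped)."""
        while self._timestamps and now - self._timestamps[0] > WINDOW:
            self._timestamps.popleft()
        return len(self._timestamps)

    def is_active(self, now):
        """An event remains valid only while its rate is above zero."""
        return self.rate(now) > 0
```
]]></preformat>
      <p>With this design an event reported only by commuters expires by itself once nobody mentions it for 15 minutes, which compensates for the missing "back to normal" messages.</p>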
      <p>Events are internally stored in the Events database, according to a structured
format (i.e. type of event, type of transportation option, specific line, direction, timestamp,
number of tweets in the last 15 minutes), in order to perform all the rate-based
considerations; subsequently, the Event Detector updates events within TAM-TAM’s
Central Database, where necessary modifying the number of related tweets in the last 15
minutes for a specific event or removing events that are no longer “active” (rate=0).
The data in the Central Database are continuously retrieved by other computational
modules, in particular the trip planning applications, in order to optimize trips according
to the current situation on the urban transportation network (i.e. delays, events on a
specific line, etc.).</p>
      <p>Sentiment Analyser (pre-processing and neutral-positive-negative classification)</p>
      <p>As the tweets published by the transportation company only contain official
communications and responses to commuters’ requests, they are not subjected to
sentiment analysis. This is why only the dotted line goes through the
corresponding computational modules (Figure 1).</p>
      <p>
        The detection and further evaluation of possible sentiment in the tweets shared by
commuters are performed through different sequential steps. First of all, some
pre-processing is performed to transform the tweet into a vector of valued features which can be
analysed through Machine Learning algorithms. This pre-processing consists of
removing stop-words (i.e. articles and prepositions) and punctuation. Although the authors are
aware that emoticons may be used to improve the effectiveness of sentiment mining [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
they are not considered in this first prototype of the tweets analyser. Furthermore, the
impact of applying, or not applying, stemming has also been considered, by using the Snowball
Stemmer (http://trimc-nlp.blogspot.it/2013/08/snowball-stemmer-for-java.html). The
following pre-processing step consists of transforming the filtered tweets into a vector of
valued features. This procedure is described in more detail in Section 3. It basically consists of a
variation of the well-known TF-IDF (Term Frequency – Inverse
Document Frequency) weighting scheme, where features are computed separately for
each classification task (i.e. neutral vs not-neutral and positive vs negative). Similar
tweet acquisition and analysis systems have been recently proposed, more specifically
for the English language, and for general purposes [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref9">9-13</xref>
        ] as well as for urban mobility [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
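      <p>A minimal sketch of the stop-word and punctuation removal step is given below. The stop-word set is a deliberately tiny, hypothetical sample of Italian articles and prepositions, and the function name preprocess is an assumption; stemming, which the authors perform with the Snowball Stemmer, is left out of this sketch.</p>
      <preformat><![CDATA[
```python
import re

# Hypothetical, deliberately small sample of Italian articles and prepositions.
STOP_WORDS = {
    "il", "lo", "la", "i", "gli", "le", "un", "una",
    "di", "a", "da", "in", "con", "su", "per", "tra", "fra",
}


def preprocess(tweet):
    """Lower-case a tweet, drop punctuation and remove stop-words.

    Mentions and hashtags (e.g. "@atm_informa") are kept as single tokens.
    """
    tokens = re.findall(r"[@#]?\w+", tweet.lower())
    return [t for t in tokens if t not in STOP_WORDS]
```
]]></preformat>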
      <p>After the current tweet has been pre-processed, only the TF-IDF features for
Neutral vs Not-neutral classification are given as input to a trained Support Vector
Machine (SVM) classifier (details about the SVM learning are provided in Section 3).
The classification output is then stored in the Sentiment database; if
the output is “not-neutral”, the values of the TF-IDF features for Positive vs
Negative classification of the current tweet are given as input to a further SVM classifier, specifically
trained for this task. As in the previous step, the classification output is stored in the Sentiment
database in order to enable, through the Query Executor module, the retrieval of useful
information to support the transportation company in making decisions aimed at
increasing commuters’ satisfaction.</p>
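      <p>The two-step classification described above can be summarized by the following sketch. The two classifiers are passed in as plain callables standing in for the trained SVM models; the function name, the argument names and the label strings are assumptions made for illustration only, and the storage of results in the Sentiment database is omitted.</p>
      <preformat><![CDATA[
```python
def classify_sentiment(neutral_features, polarity_features,
                       neutral_clf, polarity_clf):
    """Two-step cascade: neutral vs not-neutral, then positive vs negative.

    neutral_clf and polarity_clf are stand-ins for the two trained classifiers;
    each takes a feature vector and returns a label string.
    """
    first = neutral_clf(neutral_features)
    if first == "neutral":
        return "neutral"
    # Only not-neutral tweets reach the second classifier.
    return polarity_clf(polarity_features)
```
]]></preformat>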
    </sec>
    <sec id="sec-3">
      <title>Materials and Methods</title>
      <p>Design and development of the tweets analyser of the TAM-TAM platform initially
required collecting a set of tweets to be used for the training and validation of sentiment
mining classifiers. Collection started on 12 June 2013 and is still in progress, for
tweets posted both from and to the account of the public transport company in Milan
(currently around 45,000 tweets have been collected). A set of 1,332 collected tweets has
been labelled by 3 different human supervisors according to the following
three alternatives: neutral (570), positive (127) or negative (635). No specific training
was provided to the “labellers”; the set of tweets was randomly given to each
supervisor, separately, asking for a judgement about the sentiment. The mean Kappa
statistic was 0.96, showing high agreement among the labellers; the final label of every tweet
is the most frequent one (“neutral” is assigned in the case of 3 discordant labels).</p>
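      <p>The rule used to derive the final label from the three judgements can be written in a few lines. The function name final_label and the label strings are hypothetical; the rule itself (majority label, "neutral" when all three labels differ) is the one stated above.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def final_label(labels):
    """Majority label among three annotators; "neutral" if all three disagree."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= 2 else "neutral"
```
]]></preformat>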
      <p>
        To transform a tweet into a vector of features, the authors took into account
specific considerations about the properties of tweets with respect to other types of text
content. As tweets are short messages, usually unstructured and informally written,
techniques like parsing, pattern matching and complex grammars are usually ineffective.
In [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] the solution proposed to analyse the content is a representation where features
are terms and each feature is valued by the frequency of each term, which could be a
word or n-gram. More simply, in [
        <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
        ] the features are terms and they are valued as
Boolean (1 if the term is present in the text, 0 otherwise). Other approaches propose a
representation based on some computation; in [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] words are weighted by their
corresponding Inverse Document Frequency (IDF) score, that is, the logarithm of the number
of documents in the collection divided by the number of documents containing a
specific word [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Alternatively, the score known as Term Frequency–Inverse Document
Frequency (TF-IDF) may be adopted, that is the IDF score multiplied by the frequency
of a specific word divided by the number of words in the document [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. In a recent
study, proposing Bayesian Ensemble Learning for sentiment analysis, these approaches
for feature construction are compared [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
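      <p>For reference, the two weighting schemes just described can be written compactly as follows. The symbols are introduced here only for illustration and are not the authors' notation: D is the collection of documents, f<sub>t,d</sub> is the number of occurrences of term t in document d, and |d| is the number of words in d; the base of the logarithm is left unspecified, as in the text.</p>
      <disp-formula><tex-math><![CDATA[\mathrm{IDF}(t) = \log\frac{|D|}{|\{d \in D : t \in d\}|}, \qquad \mathrm{TF\text{-}IDF}(t,d) = \frac{f_{t,d}}{|d|} \cdot \mathrm{IDF}(t)]]></tex-math></disp-formula>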
      <p>
        In [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] an extension of the TF-IDF approach is proposed, consisting of weighting
words by the difference of their TF-IDF scores (delta TF-IDF) with respect to the class
associated with the text (i.e. positive or negative sentiment). The Support Vector Machine
(SVM) classification learning technique [
        <xref ref-type="bibr" rid="ref22 ref23">22, 23</xref>
        ] has been used to identify a reliable
model able to detect the polarity of a document with respect to the computed
delta TF-IDF. In particular, the proposed delta TF-IDF is defined as follows:
        <disp-formula><tex-math><![CDATA[V_{t,d} = C_{t,d} \cdot \log_2\left(\frac{|P|}{P_t}\right) - C_{t,d} \cdot \log_2\left(\frac{|N|}{N_t}\right) = C_{t,d} \cdot \log_2\left(\frac{|P|\,N_t}{P_t\,|N|}\right)]]></tex-math></disp-formula>
      </p>
      <p>where V<sub>t,d</sub> is the value of the term (feature) t in document d, C<sub>t,d</sub> is the frequency of
term t in document d, P<sub>t</sub> is the number of positively labelled documents containing term
t, |P| is the number of positively labelled documents, N<sub>t</sub> is the number of negatively
labelled documents containing term t, and |N| is the number of negatively labelled
documents. This approach proved to be more accurate with respect to the other ones and is
the core of the application presented in this paper. In particular, two different delta
TF-IDF representations are computed, one for Neutral vs Not-neutral and one for Positive
vs Negative classification, respectively.</p>
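      <p>To make the delta TF-IDF definition concrete, the following sketch computes the value of every term of one tokenised document from two lists of labelled training documents. The function name and the data layout are assumptions; in particular, skipping terms that never occur in one of the two classes is a choice made here, because the formula is undefined when P<sub>t</sub> or N<sub>t</sub> is zero and the text does not say how that case is handled.</p>
      <preformat><![CDATA[
```python
import math
from collections import Counter


def delta_tfidf(doc_tokens, pos_docs, neg_docs):
    """Delta TF-IDF value for every term of one tokenised document.

    pos_docs and neg_docs are lists of tokenised, labelled training documents.
    Terms absent from either class are skipped (an assumption, see lead-in).
    """
    n_pos, n_neg = len(pos_docs), len(neg_docs)   # |P| and |N|
    counts = Counter(doc_tokens)                  # C_{t,d}
    values = {}
    for term, c_td in counts.items():
        p_t = sum(term in doc for doc in pos_docs)  # P_t
        n_t = sum(term in doc for doc in neg_docs)  # N_t
        if p_t == 0 or n_t == 0:
            continue
        values[term] = c_td * math.log2((n_pos * n_t) / (p_t * n_neg))
    return values
```
]]></preformat>
      <p>With the formula as written, terms that are relatively more frequent in negatively labelled documents receive positive values and terms more typical of positively labelled documents receive negative values.</p>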
      <p>The dataset of 1,332 labelled tweets was first divided into two different
datasets, one related to tweets having neutral and not-neutral labels and one related to
tweets having positive and negative labels. Then, tweets in each of these datasets
were pre-processed according to the procedure described in Section 2,
and delta TF-IDF was computed for each term. Both using and not using stemming were
considered; thus two different datasets were generated from each of the
previous ones, characterized by a different set of features.</p>
      <p>Furthermore, in order to reduce dimensionality, features have been ranked according
to the corresponding delta TF-IDF and only the first n relevant features (terms) have
been selected for each class (where n has been experimentally set to 10 when
stemming is not adopted and 15 when stemming is used). As a consequence of
this step, the number of labelled tweets is reduced, because some tweets
contain none of the selected features. Table 1 summarizes the figures for each of
the datasets built starting from the initial set of 1,332 labelled tweets. In order to
use all the available data, the two classification learning tasks have been performed
separately, while the two-step classification is only performed on newly arriving tweets
when the module is deployed within the platform.</p>
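      <p>The feature selection and the resulting reduction of the dataset could be sketched as below. This is an interpretation rather than the authors' procedure: it assumes one aggregate score per term and that the sign of the score indicates the class the term is typical of, so the n lowest-scoring and the n highest-scoring terms are taken as the "first n" features of the two classes. Function names are hypothetical.</p>
      <preformat><![CDATA[
```python
def select_features(scores, n):
    """Return the n lowest-scoring and the n highest-scoring terms.

    scores maps each term to one aggregate delta TF-IDF value (assumption).
    """
    ranked = sorted(scores, key=scores.get)
    lowest = ranked[:n]
    highest = ranked[max(len(ranked) - n, 0):]
    return set(lowest) | set(highest)


def filter_tweets(tweets, features):
    """Keep only tokenised tweets containing at least one selected term."""
    return [t for t in tweets if features.intersection(t)]
```
]]></preformat>
      <p>Tweets dropped by filter_tweets correspond to the tweets that, as noted above, contain none of the selected features and therefore leave the training set.</p>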
      <p>Therefore, the number of “positive vs negative” tweets does not add up to the number of
“not-neutral” tweets, due to the different filtering performed. For instance, when the original 1,332
tweets are filtered according to the 10 most relevant features for “neutral vs not-neutral”
(no-stemming case), 554 tweets (1,332-778) are removed because they do not contain
any of the selected words. Similarly, 559 tweets are selected, among those of the 1,332 having
a not-neutral label, when the filtering (no-stemming) is applied.</p>
      <p>As a first result, the list of terms ranked according to delta TF-IDF values is reported
in Table 2, for each classification task, with and without stemming.</p>
      <p>
        With respect to the classification learning task, a combination of the SVM
implementation provided by the WEKA suite (Waikato Environment for Knowledge
Analysis, http://www.cs.waikato.ac.nz/ml/index.html) and Genetic Algorithms, aimed at
optimizing the SVM configuration (regularization C and γ of the Radial Basis Function kernel),
has been used [
        <xref ref-type="bibr" rid="ref24 ref25 ref26">24-26</xref>
        ].
      </p>
      <p>As the classes are unbalanced, the Balanced Classification Accuracy (BAC) and F-score
have been used to select the best performing SVM classifier according to a 10-fold
cross-validation procedure. Furthermore, SVM has also been compared to other
classification learning algorithms offered by the WEKA suite, in particular the ZeroR
classifier, which classifies any instance as belonging to the most frequent class in the dataset
(baseline), Artificial Neural Networks (RBF-Network and Multi-Layer Perceptron,
MLP) and Naïve Bayes. Table 3 summarizes the obtained results.</p>
      <p>Datasets (table rows): Neutral vs Not-neutral (without stemming); Positive vs Negative (without stemming); Neutral vs Not-neutral (with stemming); Positive vs Negative (with stemming).</p>
      <p>Balanced Accuracy and F-score are similar across the different classification
learning algorithms and higher than the baseline. According to the definition of BAC (i.e.
the average of sensitivity and specificity), its value is always 50% for ZeroR and
only the F-score varies. SVM proved to be the best-performing classification learning
strategy; however, some differences emerged among the available datasets: in particular,
performance is higher for Positive vs Negative classification than for Neutral
vs Not-neutral classification, while stemming does not make any difference in either Neutral
vs Not-neutral or Positive vs Negative classification.</p>
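      <p>The remark on ZeroR follows directly from the definitions, as the short sketch below shows. The function names and the example counts are hypothetical and are not taken from the paper's tables; the helper functions assume non-empty classes and do not guard against division by zero.</p>
      <preformat><![CDATA[
```python
def balanced_accuracy(tp, fn, tn, fp):
    """Average of sensitivity (true positive rate) and specificity (true negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def f_score(tp, fn, fp):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# ZeroR on a hypothetical dataset with 60 majority-class and 40 minority-class
# instances: everything is predicted as the majority (positive) class.
zero_r_bac = balanced_accuracy(tp=60, fn=0, tn=0, fp=40)
zero_r_f = f_score(tp=60, fn=0, fp=40)
```
]]></preformat>
      <p>Since ZeroR always has sensitivity 1 and specificity 0 with respect to the majority class, its balanced accuracy is 0.5 whatever the class proportions, while its F-score depends on them.</p>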
      <p>As a final decision, stemming has been adopted for “Neutral vs Not-neutral”
classification but not for “Positive vs Negative” classification. Therefore, in the pre-processing
step every tweet generates two different vectors: the first (stemmed) is the input of
“Neutral vs Not-neutral” classification, while the second (not-stemmed) is the input of
“Positive vs Negative” classification, if and only if it is classified as “Not-neutral” at
the first step.</p>
      <p>Table 4 reports the SVM configurations associated with the performances in
Table 3, along with the percentage of overall instances
used as Support Vectors (%SVs). This is another important index for evaluating the
capability of an SVM classifier to correctly classify new instances not used for
learning. According to both Balanced Accuracy and %SVs, the Neutral
vs Not-neutral classification is more difficult than the Positive vs Negative classification.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>The developed tweets analyser module, based on text mining and SVM classification
and deployed in the prototype of the TAM-TAM platform, enabled innovative
added-value services for commuters, aimed at improving urban mobility in the city of Milan.
While event detection is used to optimize trip planning, sentiment analysis is currently
mainly devoted to supporting the transportation supplier in addressing commuters’ needs and
improving their satisfaction. On the other hand, the idea is to use the output of sentiment
analysis according to a collective intelligence paradigm by providing also commuters
with information about the perceived quality of transportation service, and specific
mobility options, as spontaneously reported by the other commuters. This will allow users
of the transportation service, citizens as well as tourists, to plan their trips by also
considering some social indicators of satisfaction.</p>
      <p>Currently, the two most relevant limitations of the work are that the solution strictly
depends on language, as it has so far been validated only on Italian, and that the
dataset of labelled tweets is limited. While the first limitation is not yet so relevant, since almost
all the tweets from and to @atm_informa are written in Italian, the second could be the
reason for the lower accuracy in the Neutral vs Not-neutral classification. Gamification-based
apps, aimed at enabling labelling by TAM-TAM users, have already been
identified as effective solutions for increasing both the number of labelled tweets over time
and label objectivity, according to the judgements provided by multiple users.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Srivastava</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vakali</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Towards a narrative aware design framework for smart urban environment</article-title>
          . F. Álvarez et al. (Eds.):
          <source>FIA</source>
          <year>2012</year>
          , LNCS
          <volume>7281</volume>
          ,
          <fpage>166</fpage>
          -
          <lpage>177</lpage>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Candelieri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Archetti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giordani</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arosio</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sormani</surname>
          </string-name>
          , R.:
          <article-title>Smart cities management by integrating sensors, models and user generated contents</article-title>
          .
          <source>WIT Transactions on Ecology and the Environment</source>
          ,
          <volume>179</volume>
          (
          <issue>1</issue>
          ),
          <fpage>719</fpage>
          -
          <lpage>730</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Thom</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bosch</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koch</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Worner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ertl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Spatiotemporal Anomaly Detection through Visual Analysis of Geolocated</article-title>
          .
          <source>IEEE Pacific Visualization Symposium</source>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Burnap</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rana</surname>
            ,
            <given-names>O.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Avis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Housley</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Edwards</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morgan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sloan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Detecting tension in online communities with computational Twitter analysis</article-title>
          .
          <source>Technological Forecasting and Social Change</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Burnap</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>M. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sloan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rana</surname>
            ,
            <given-names>O. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Housley</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Edwards</surname>
            ,
            <given-names>A. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knight</surname>
            ,
            <given-names>V. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morgan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Procter</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Voss</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Tweeting the terror: modelling the social media reaction to the Woolwich terrorist attack</article-title>
          .
          <source>Social Network Analysis and Mining</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Opinion Mining and Sentiment Analysis</article-title>
          .
          <source>Foundations and Trends in Information Retrieval</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          , 2),
          <fpage>1</fpage>
          -
          <lpage>135</lpage>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Candelieri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Archetti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Analyzing tweets to enable sustainable, multi-modal and personalized urban mobility: Approaches and results from the Italian project TAM-TAM</article-title>
          .
          <source>WIT Transactions on the Built Environment</source>
          ,
          <volume>138</volume>
          ,
          <fpage>373</fpage>
          -
          <lpage>379</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Pozzi</surname>
            ,
            <given-names>F.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maccagnola</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fersini</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Messina</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Enhance user-level Sentiment Analysis on microblogs with approval relations</article-title>
          .
          <source>Proceedings of the 13th International Conference on Advances in Artificial Intelligence</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Burnap</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rana</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Housley</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Edwards</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morgan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sloan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Conejero</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>COSMOS: Towards an integrated and scalable service for analysing social media on demand</article-title>
          .
          <source>Intern. Journal of Parallel, Emergent and Distributed Systems</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Amati</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bianchi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marcone</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Sentiment Estimation on Twitter</article-title>
          (http://ceur-ws.org/Vol-1127/paper7.pdf)
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Musto</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Semeraro</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lops</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Gemmis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Narducci</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bordoni</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Annunziato</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meloni</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Orsucci</surname>
            ,
            <given-names>F. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paoloni</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Developing a Semantic Content Analyzer for L'Aquila Social Urban Network</article-title>
          (http://ceur-ws.org/Vol-1127/paper6.pdf)
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Amati</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Angelini</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bianchi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Costantini</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marcone</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>A scalable approach to near real-time sentiment analysis on social networks</article-title>
          .
          <source>In DART 2014, Information Filtering and Retrieval. Proceedings of the 8th International Workshop on Information Filtering and Retrieval, co-located with XIII AI*IA Symposium on Artificial Intelligence (AI*IA 2014)</source>
          , Pisa, Italy, December 10, 2014. CEUR Workshop Proceedings,
          <volume>1314</volume>
          ,
          <fpage>12</fpage>
          -
          <lpage>23</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Musto</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Semeraro</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Polignano</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>A Comparison of Lexicon-based Approaches for Sentiment Analysis of Microblog Posts</article-title>
          .
          <source>In DART 2014, Information Filtering and Retrieval. Proceedings of the 8th International Workshop on Information Filtering and Retrieval, co-located with XIII AI*IA Symposium on Artificial Intelligence (AI*IA 2014)</source>
          , Pisa, Italy, December 10, 2014. CEUR Workshop Proceedings,
          <volume>1314</volume>
          ,
          <fpage>59</fpage>
          -
          <lpage>68</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krishnan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Transportation Sentiment Analysis for Safety Enhancement</article-title>
          ,
          <source>Final Project Report. Technologies for Safe and Efficient Transportation</source>
          , Carnegie Mellon University (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Joachims</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Text Categorization with Support Vector Machines: Learning with Many Relevant Features</article-title>
          , Springer (
          <year>1997</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vaithyanathan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Thumbs up? Sentiment classification using machine learning techniques</article-title>
          .
          <source>In Proceedings of EMNLP</source>
          (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Whitelaw</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garg</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Argamon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Using appraisal groups for sentiment analysis</article-title>
          .
          <source>In Proceedings of the 14th ACM International Conference on Information and Knowledge Management</source>
          ,
          <fpage>625</fpage>
          -
          <lpage>631</lpage>
          (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pantel</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chklovski</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pennacchiotti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Automatically assessing review helpfulness</article-title>
          .
          <source>In Proceedings of EMNLP</source>
          ,
          <fpage>423</fpage>
          -
          <lpage>430</lpage>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Baeza-Yates</surname>
            ,
            <given-names>R.A.</given-names>
          </string-name>
          :
          <source>Modern Information Retrieval</source>
          . Addison-Wesley Longman Publishing Co. (
          <year>1999</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Fersini</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Messina</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pozzi</surname>
            ,
            <given-names>F.A.</given-names>
          </string-name>
          :
          <article-title>Sentiment analysis: Bayesian Ensemble Learning</article-title>
          .
          <source>Decision Support Systems</source>
          ,
          <volume>68</volume>
          ,
          <fpage>26</fpage>
          -
          <lpage>38</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Martineau</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Finin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Delta TFIDF: An Improved Feature Space for Sentiment Analysis</article-title>
          .
          <source>In Proceedings of the Third International ICWSM Conference</source>
          ,
          <fpage>258</fpage>
          -
          <lpage>261</lpage>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Schölkopf</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smola</surname>
            ,
            <given-names>A. J.</given-names>
          </string-name>
          :
          <article-title>Learning with kernels. Support Vector Machines, regularization, optimization and beyond</article-title>
          . Massachusetts Institute of Technology, USA (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Statistical Learning Theory</article-title>
          . New York, Wiley (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Candelieri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A hyper-solution framework for classification problems via metaheuristic approaches</article-title>
          .
          <source>4OR</source>
          ,
          <volume>9</volume>
          (
          <issue>4</issue>
          ),
          <fpage>425</fpage>
          -
          <lpage>428</lpage>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Candelieri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Conforti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>A Hyper-Solution Framework for SVM Classification: Application for Predicting Destabilizations in Chronic Heart Failure Patients</article-title>
          .
          <source>The Open Medical Informatics Journal</source>
          ,
          <volume>4</volume>
          ,
          <fpage>136</fpage>
          -
          <lpage>140</lpage>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Candelieri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sormani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arosio</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giordani</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Archetti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>A Hyper-solution Framework for SVM Classification: Improving Damage Detection on Helicopter Fuselage Panels</article-title>
          .
          <source>AASRI 2013, Conf. on Intelligent Systems and Control. AASRI Procedia</source>
          <volume>4</volume>
          ,
          <fpage>31</fpage>
          -
          <lpage>36</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>