<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Neural models for StanceCat shared task at IberEval 2017</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Luca Ambrosini</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giancarlo Nicolo</string-name>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <fpage>210</fpage>
      <lpage>216</lpage>
      <abstract>
        <p>This paper describes our participation in the Stance and Gender Detection in Tweets on Catalan Independence (StanceCat) task at IberEval 2017. Our approach focused on neural models: we first used classical, task-specific models from the state of the art, and then introduced a new convolutional network topology for text classification. The rise of social networks as a worldwide means of communication and expression is attracting a lot of interest from companies and academia, due to the huge amount of content published daily by users. From the academic perspective, especially in the Natural Language Processing field, the content available in the form of written text is very useful for the study of specific open problems; stance detection related to political events is one example, and the Stance and Gender Detection in Tweets on Catalan Independence (StanceCat) task at IberEval 2017 is a concrete application. In StanceCat, the principal aim is to automatically detect whether the author of a text is in favor of, against, or neutral towards Catalan independence. As a secondary aim, participants are asked to infer the author's gender. To tackle this problem we built a stance-and-gender detection system decomposed into two main modules: text pre-processing and classification model. During the system's tuning process, different design choices were explored to find the best combination of modules, and some interesting insights can be drawn from their analysis. In the following sections we first describe the StanceCat task (Section 2), then we illustrate the design of the modules of the developed stance-and-gender detection system (Section 3), after that we analyse the evaluation of the tuning process for the submitted systems (Section 4), and finally we outline conclusions over the whole work (Section 5).</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Regarding the text pre-processing, it has to be mentioned that the corpus under observation cannot be treated as proper written language: computer-mediated communication (CMC) is highly informal, affecting diamesic variation (i.e., variation in a language across media of communication, e.g., Spanish over the phone versus Spanish over email) and creating new items that pertain to the lexical and graphematic domains [
        <xref ref-type="bibr" rid="ref7 ref8">7,8</xref>
        ]. Therefore, our pre-processing follows two approaches: classic and microblogging-related. As classic approach we used stemming (ST), stopword removal (SW) and punctuation removal (PR). For the microblogging approach we focused on the following items: (i) mentions (MT), (ii) smileys (SM), (iii) emoji (EM), (iv) hashtags (HT), (v) numbers (NUM), (vi) URLs (URL), and (vii) Twitter reserved words such as RT and FAV (RW). Each of these items can either be removed or substituted by a constant string. We implemented both approaches using the following tools: (i) NLTK [
(i) NLTK [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and (ii) Preprocessor, a pre-processing library for tweet data written in Python (https://github.com/s/preprocessor).
      </p>
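      <p>To make the microblogging approach concrete, the following sketch shows how such a module can be implemented with simple regular expressions. This is only an assumption of how the step could look, not the exact implementation; the item names follow the abbreviations above, and the regexes and the placeholder format are hypothetical.</p>
      <preformat>
```python
import re

# Hypothetical patterns for the microblogging items (the exact
# regexes and the placeholder format are assumptions).
PATTERNS = {
    "MT": re.compile(r"@\w+"),             # mentions
    "HT": re.compile(r"#\w+"),             # hashtags
    "URL": re.compile(r"https?://\S+"),    # URLs
    "NUM": re.compile(r"\b\d+\b"),         # numbers
    "RW": re.compile(r"\b(?:RT|FAV)\b"),   # Twitter reserved words
}

def preprocess(tweet, substitute=("MT", "HT", "NUM"), remove=("URL", "RW")):
    """Remove some items, substitute the others by a constant string."""
    for item in remove:
        tweet = PATTERNS[item].sub("", tweet)
    for item in substitute:
        tweet = PATTERNS[item].sub(f"_{item}_", tweet)
    return re.sub(r"\s+", " ", tweet).strip()
```
      </preformat>
      <p>For example, preprocess("RT @user check https://t.co/x #indy 123") yields "_MT_ check _HT_ _NUM_": the reserved word and the URL are removed, while the mention, hashtag and number are replaced by constant strings.</p>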
      <sec id="sec-1-1">
        <title>Classification models</title>
        <p>In the following we describe the neural models used for the classification module. Before introducing the models, we describe the specific text representation used as input layer (i.e., the sentence-matrix).</p>
        <p>
          Text representation. To represent the text we used word embeddings as described in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ],
          ], where words are represented as vectors of real numbers with fixed dimension |v|. In this way a whole sentence s, whose length |s| is its number of words, is represented as a sentence-matrix M of dimension |M| = |s| × |v|. |M| has to be fixed a priori, therefore |s| and |v| have to be estimated. |v| was fixed to 300 following [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. |s| was estimated by analyzing Table 2; in detail, we decided to fix it as the average length plus one standard deviation (i.e., |s| = 17 for both languages). With this choice, input sentences longer than |s| are truncated, while shorter ones are padded with null vectors (i.e., vectors of all zeros).
        </p>
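        <p>The truncation-and-padding step above can be sketched as follows. This is a minimal illustration under the stated choices (|s| = 17, |v| = 300); the handling of out-of-vocabulary words as null vectors is an assumption.</p>
        <preformat>
```python
import numpy as np

EMB_DIM = 300   # |v|, following the pre-trained embeddings
MAX_LEN = 17    # |s|, average tweet length plus one standard deviation

def sentence_matrix(tokens, embeddings):
    """Build the |s| x |v| sentence-matrix: truncate long sentences,
    pad short ones with null (all-zero) vectors."""
    M = np.zeros((MAX_LEN, EMB_DIM))
    for i, tok in enumerate(tokens[:MAX_LEN]):
        # out-of-vocabulary words stay as null vectors (an assumption)
        if tok in embeddings:
            M[i] = embeddings[tok]
    return M
```
        </preformat>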
        <p>Choosing words as the elements to be mapped by the embedding function raises some challenges for the function estimation, related to data availability. In our case the available corpus is very small, and embeddings estimated on it could lead to low performance. To solve this problem, we decided to use pre-trained embeddings estimated over Wikipedia with a particular approach called fastText [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
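        <p>The pre-trained fastText vectors are distributed in a plain-text format: a header line with the vocabulary size and dimension, then one word per line followed by its components. A minimal loader might look like the following sketch (the function name and the limit parameter are our own; real vocabularies are large, so limiting the number of loaded rows is a practical assumption).</p>
        <preformat>
```python
import numpy as np

def load_vec(path, limit=None):
    """Parse the textual .vec format: a "count dim" header line,
    then one word per line followed by its vector components."""
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        n, dim = map(int, f.readline().split())
        for i, line in enumerate(f):
            if limit is not None and i >= limit:
                break
            parts = line.rstrip().split(" ")
            embeddings[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return embeddings
```
        </preformat>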
        <p>Convolutional Neural Network. Convolutional Neural Networks (CNNs) are considered state of the art in many text classification problems. Therefore, we decided to use them in a simple architecture composed of a convolutional layer, followed by a Global Max Pooling layer and two dense layers.</p>
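        <p>The two core operations of this architecture can be illustrated in plain numpy. This is only a sketch of the computation, not the trained model: the filter weights here are arbitrary, and in practice standard deep learning layers are used.</p>
        <preformat>
```python
import numpy as np

def conv1d(M, W):
    """Valid 1-D convolution over the sentence axis: M is the |s| x |v|
    sentence-matrix, W a k x |v| filter spanning the full embedding
    width, so the filter slides over word positions only."""
    k = W.shape[0]
    return np.array([np.sum(M[i:i + k] * W)
                     for i in range(M.shape[0] - k + 1)])

def global_max_pool(feature_map):
    """Global Max Pooling keeps only the strongest activation of a
    filter, independent of where in the sentence it occurred."""
    return feature_map.max()
```
        </preformat>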
        <p>
          Dilated KIM. This model is our new CNN topology. It can be seen as an extension of Kim's model [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] using the dilation idea from the computer graphics field [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>The original Kim's model is a particular CNN whose convolutional layer has multiple filter widths and feature maps. The complete architecture is illustrated in Figure 1: the input layer (i.e., the sentence-matrix) is processed by a convolutional layer with multiple filters of different widths, the result of each filter is fed into a Max Pooling layer, and finally their concatenation (previously flattened to be dimensionally coherent) is projected into a dense layer. Our extension is to use dilated filters in combination with normal ones: the intuition is that normal filters capture features of adjacent words, while dilated ones are able to capture relations between non-adjacent words. This behaviour cannot be achieved by the original Kim's model because, even though the filter sizes can be changed, the filters will only capture features from adjacent words.</p>
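        <p>The effect of dilation can be sketched in numpy as follows: the k taps of the filter are spaced d positions apart, so a small filter relates non-adjacent words. This is an illustration of the mechanism only, under our own naming; the actual layers come from a deep learning framework.</p>
        <preformat>
```python
import numpy as np

def dilated_conv1d(M, W, d=1):
    """1-D convolution whose k taps are spaced d positions apart
    (d = 1 is an ordinary convolution).  With d greater than 1 a small
    filter captures relations between non-adjacent words."""
    k = W.shape[0]
    span = (k - 1) * d + 1           # receptive field over the sentence
    return np.array([
        np.sum(M[i:i + span:d] * W)
        for i in range(M.shape[0] - span + 1)
    ])
```
        </preformat>
        <p>With a filter of width 2 and dilation 2, each output sums a word with the word two positions away, skipping the one in between.</p>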
        <p>
          Regarding the architectural references in [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], the number of filters |f| and their dimensions (k, d), where k is the kernel size and d the dilation unit, were optimized, leading to the following results: |f| = 5; f1 = (2 × |v|, 0); f2 = (2 × |v|, 3); f3 = (3 × |v|, 1); f4 = (5 × |v|, 1); f5 = (7 × |v|, 1).
        </p>
        <p>
          Recurrent neural networks. Long Short-Term Memory (LSTM) and Bidirectional LSTM networks are types of Recurrent Neural Network (RNN) that aim at capturing dynamic temporal behaviour. This suggested using them for stance detection; in particular, we used straightforward architectures made of an embedded input layer followed by an LSTM layer of 128 units and terminated by a dense layer, for both the normal and the bidirectional models.
        </p>
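        <p>For reference, one step of the LSTM recurrence can be written out in numpy as below. This is a minimal textbook sketch of the cell update, not our training code (which used standard framework layers); the stacking convention for the gate weights is an assumption.</p>
        <preformat>
```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step with hidden size n: W stacks the input, forget,
    cell and output projections as a (4n, input_dim) matrix, U the
    recurrent ones as (4n, n), b the biases as (4n,)."""
    z = W @ x + U @ h + b
    i, f, g, o = np.split(z, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # updated cell state
    h_new = sigmoid(o) * np.tanh(c_new)                # emitted hidden state
    return h_new, c_new
```
        </preformat>
        <p>A bidirectional model runs a second LSTM over the reversed word sequence and concatenates the two resulting representations; in our architectures the hidden size is 128 and x is a row of the sentence-matrix.</p>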
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Evaluation</title>
      <p>In this section we illustrate the evaluation of the developed systems with respect to the module design reported in Section 3. First we present the metric proposed by the organizers for system evaluation (Section 4.1), then we outline the empirical results produced by a 10-fold cross-validation over the given data set (Section 4.2), and finally we report our performance at the shared task (Section 4.3).</p>
      <sec id="sec-2-1">
        <title>Metrics</title>
        <p>System evaluation metrics were given by the organizers and are reported here in the following equations. The organizers chose a macro F1 measure for stance detection, due to class imbalance, and categorical accuracy for gender detection.</p>
        <p>Gender = accuracy = (Σ TP + Σ TN) / Σ samples (1)</p>
        <p>Stance = [F1_macro(Favor) + F1_macro(Against)] / 2 (2)</p>
        <p>F1 = 2 · (precision · recall) / (precision + recall) (3)</p>
        <p>precision = (1 / |L|) Σ_{l ∈ L} Pr(y_l, ŷ_l) (4)</p>
        <p>recall = (1 / |L|) Σ_{l ∈ L} R(y_l, ŷ_l) (5)</p>
        <p>where L is the set of classes, y_l is the set of correct labels and ŷ_l is the set of predicted labels for class l, and Pr and R denote per-class precision and recall.</p>
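        <p>The stance metric can be sketched as a plain-Python re-implementation, assuming the usual per-class F1 and a FAVOR/AGAINST/NONE label set; the official evaluation script remains authoritative. Note that the NONE class contributes only through false positives of the other two classes.</p>
        <preformat>
```python
def f1(labels, preds, cls):
    """Per-class F1 from true positives, false positives, false negatives."""
    tp = sum(1 for y, p in zip(labels, preds) if y == cls and p == cls)
    fp = sum(1 for y, p in zip(labels, preds) if y != cls and p == cls)
    fn = sum(1 for y, p in zip(labels, preds) if y == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def stance_score(labels, preds):
    """Stance metric: average of the F1 of FAVOR and AGAINST only."""
    return (f1(labels, preds, "FAVOR") + f1(labels, preds, "AGAINST")) / 2
```
        </preformat>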
      </sec>
      <sec id="sec-2-2">
        <title>Fine tuning process</title>
        <p>In the following, we describe the fine-tuning process of our proposed model over the possible combinations of pre-processing (Table 3); then we compare Kim's model against our extension (Table 4), and finally we report the improvement obtained with a data augmentation technique (Table 5). For brevity, only the evaluation of the Dilated Kim model on Spanish stance detection is reported; in detail, the results are calculated by averaging three runs of a 10-fold cross-validation over the complete data set. The results obtained after the fine-tuning process for all the models are nevertheless reported in Section 4.3, where their development performance is compared against that obtained in the StanceCat task.</p>
        <p>The notation used in Table 3 follows the one introduced in Section 3.1, where listing a notation means it was applied for the reported result. Regarding the tweet-specific pre-processing, all items were substituted by constant strings, with the exception of URL and RW, which were removed. We report the contribution of each analysed pre-processing step alone:</p>
        <p>- Most of the commonly used pre-processing steps decrease the model's performance, meaning that their information can be directly exploited by the model.</p>
        <p>- Only stemming and mention removal brought small improvements; they are therefore used as the best tuning for our proposed model.</p>
        <p>Table 4 reports, for Kim's model and our Dilated Kim, the scores on stance (ES, CA) and gender (ES, CA): Kim obtains 0.624 (±0.017), 0.630 (±0.022), 0.634 (±0.011) and 0.655 (±0.017), while Dilated Kim obtains 0.658 (±0.039), 0.659 (±0.028), 0.652 (±0.013) and 0.715 (±0.015).</p>
        <p>From the analysis of Table 4 we can see a significant performance improvement of our proposed model with respect to the original Kim's model.</p>
        <p>Since our development data set has few samples, to train our models we decided to apply a data augmentation technique that does not rely on external data but rather exploits the word-embedding text representation. In detail, we applied Gaussian noise to the word embeddings and after every convolutional layer and, to further improve performance, we took advantage of batch normalization. The best results were achieved with additive zero-centered noise with a standard deviation of 0.2. Regarding the batch normalization layers, the default Keras parameters were used. The results of this technique with respect to the Dilated Kim model are reported in Table 5.</p>
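        <p>The embedding-level part of this augmentation can be sketched as follows: additive zero-centred Gaussian noise with the standard deviation reported above. This is a minimal numpy illustration of the idea (in the actual models the noise and batch normalization are framework layers applied during training only).</p>
        <preformat>
```python
import numpy as np

rng = np.random.default_rng(0)

def augment(sentence_matrix, std=0.2):
    """Additive zero-centred Gaussian noise on the word embeddings:
    every training pass sees a slightly perturbed version of each
    sentence, augmenting the data without any external resource."""
    return sentence_matrix + rng.normal(0.0, std, sentence_matrix.shape)
```
        </preformat>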
      </sec>
      <sec id="sec-2-3">
        <title>Competition results</title>
        <p>For the system submission, participants were allowed to send more than one model, up to a maximum of 5 runs; therefore in Tables 6 and 7 we report our best performing systems (tuned following the process in Section 4.2) for the StanceCat shared task.</p>
        <p>Unfortunately, due to a submission error caught only after the official results were published, our runs were not properly evaluated (in Tables 6 and 7 the submitted models are shown with their names in parentheses and an asterisk). After the task closed we therefore asked the organizers to evaluate some of our models to see how they would have performed (models with test results in bold).</p>
        <p>In this paper we have presented our participation in the Stance and Gender Detection in Tweets on Catalan Independence (StanceCat) task at IberEval 2017. Five distinct neural models were explored, in combination with different types of pre-processing. From the fine-tuning process we found that most well-known pre-processing techniques are strongly model dependent, meaning that the pre-processing pipeline has to be optimized for each classifier. Finally, our proposed dilation technique for NLP tasks, the Dilated Kim model, seems to improve the performance of CNN-based classifiers.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Taule</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mart</surname>
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rangel</surname>
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bosco</surname>
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patti</surname>
            <given-names>V</given-names>
          </string-name>
          .
          <article-title>Overview of the task of Stance and Gender Detection in Tweets on Catalan Independence at IBEREVAL 2017</article-title>
          .
          <source>In: Notebook Papers of Workshop on SEPLN 2nd Workshop on Evaluation of Human Language Technologies for Iberian Languages, (IBEREVAL)</source>
          , Murcia, Spain,
          <source>September 19, CEUR Workshop Proceedings. CEUR-WS.org</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Kim</surname>
          </string-name>
          , Yoon.
          <article-title>Convolutional neural networks for sentence classification</article-title>
          .
          <source>arXiv preprint arXiv:1408.5882</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Joulin</surname>
          </string-name>
          ,
          <string-name>
            <surname>Armand</surname>
          </string-name>
          , et al.
          <article-title>Bag of tricks for efficient text classification</article-title>
          .
          <source>arXiv preprint arXiv:1607.01759</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Edward</given-names>
            <surname>Loper</surname>
          </string-name>
          and
          <string-name>
            <given-names>Steven</given-names>
            <surname>Bird</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>NLTK: the Natural Language Toolkit</article-title>
          .
          <source>In Proceedings of the ACL-02 Workshop on Effective tools and methodologies for teaching natural language processing and computational linguistics - Volume 1 (ETMTNLP '02)</source>
          , Vol.
          <volume>1</volume>
          . Association for Computational Linguistics, Stroudsburg, PA, USA,
          <fpage>63</fpage>
          -
          <lpage>70</lpage>
          . DOI=http://dx.doi.org/10.3115/1118108.1118117
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bojanowski</surname>
          </string-name>
          , Piotr and Grave, Edouard and Joulin, Armand and Mikolov, Tomas.
          <source>Enriching Word Vectors with Subword Information. arXiv preprint arXiv:1607.04606</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Harris</surname>
          </string-name>
          , Zellig S. \
          <source>Distributional structure." Word 10</source>
          .
          <fpage>2</fpage>
          -
          <lpage>3</lpage>
          (
          <year>1954</year>
          ):
          <fpage>146</fpage>
          -
          <lpage>162</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Bazzanella</surname>
          </string-name>
          , Carla.
          <article-title>Oscillazioni di informalità e formalità: scritto, parlato e rete. In: Formale e informale. La variazione di registro nella comunicazione elettronica</article-title>
          . Roma: Carocci (
          <year>2011</year>
          ):
          <fpage>68</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Cerruti</surname>
          </string-name>
          , Massimo, and Cristina Onesti.
          <article-title>Netspeak: a language variety? Some remarks from an Italian sociolinguistic perspective. In: Languages go web: Standard and non-standard languages on the Internet</article-title>
          (
          <year>2013</year>
          ):
          <fpage>23</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Zhang, Ye, and Byron Wallace.
          <article-title>A sensitivity analysis of (and practitioners' guide to) convolutional neural networks for sentence classification</article-title>
          .
          <source>arXiv preprint arXiv:1510.03820</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Bosco</surname>
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lai</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patti</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rangel</surname>
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2016</year>
          )
          <article-title>Tweeting in the Debate about Catalan Elections</article-title>
          .
          <source>In: Proc. LREC workshop on Emotion and Sentiment Analysis Workshop (ESA)</source>
          , LREC2016, Portoroz, Slovenia, May
          <volume>23</volume>
          -
          <issue>28</issue>
          , pp.
          <fpage>67</fpage>
          -
          <lpage>70</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Verhoeven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Daelemans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          (
          <year>2016</year>
          )
          <article-title>Overview of the 4th Author Profiling Task at PAN 2016: Cross-Genre Evaluations</article-title>
          . In:
          <string-name>
            <surname>Balog</surname>
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cappellato</surname>
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macdonald</surname>
            <given-names>C</given-names>
          </string-name>
          . (Eds.)
          <article-title>CLEF 2016 Labs and Workshops, Notebook Papers</article-title>
          .
          <source>CEUR Workshop Proceedings. CEUR-WS.org</source>
          , vol.
          <volume>1609</volume>
          , pp.
          <fpage>750</fpage>
          -
          <lpage>784</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Mohammad</surname>
          </string-name>
          ,
          <string-name>
            <surname>Saif</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Parinaz</given-names>
            <surname>Sobhani</surname>
          </string-name>
          , and Svetlana Kiritchenko.
          <source>Stance and sentiment in tweets. arXiv preprint arXiv:1605.01655</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Mohammad</surname>
          </string-name>
          ,
          <string-name>
            <surname>Saif</surname>
            <given-names>M.</given-names>
          </string-name>
          , et al.
          <article-title>Semeval-2016 task 6: Detecting stance in tweets.</article-title>
          <source>Proceedings of SemEval 16</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Yu</surname>
            <given-names>Fisher</given-names>
          </string-name>
          and Vladlen Koltun.
          <article-title>Multi-Scale Context Aggregation by Dilated Convolutions</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>