<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>FACT2020: Factuality Identification in Spanish Text</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Arturo Collazo</string-name>
          <email>arturo.collazo@fing.edu.uy</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Agustín Rieppi</string-name>
          <email>agustin.rieppi@fing.edu.uy</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tiziana Romani</string-name>
          <email>tiziana.romani@fing.edu.uy</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guillermo Trinidad</string-name>
          <email>gtrinidad@fing.edu.uy</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Facultad de Ingeniería, Universidad de la República</institution>
          , Montevideo,
          <country country="UY">Uruguay</country>
        </aff>
      </contrib-group>
      <fpage>206</fpage>
      <lpage>213</lpage>
      <abstract>
        <p>In this article we present our proposal for tasks 1 and 2 of the FACT (Factuality Analysis and Classification Task) challenge. The objective of Task 1 is to build a system capable of classifying the factuality of given events found in Spanish texts. Although we present several approaches, the best-performing classifier is a recurrent neural network trained on word embeddings of the event word and its surroundings, which reports an F1 macro score of 0.6. For Task 2, a simple rule-based approach is used, reaching an F1 macro score of 0.84.</p>
      </abstract>
      <kwd-group>
        <kwd>Factuality classification</kwd>
        <kwd>Factuality identification</kwd>
        <kwd>FACT</kwd>
        <kwd>NLP</kwd>
        <kwd>Neural networks</kwd>
        <kwd>Random Forest classifier</kwd>
        <kwd>Word embeddings</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Factuality Classification</title>
      <sec id="sec-2-1">
        <title>2.1. Task1 description</title>
        <p>Factuality is the property of an event that indicates whether it is presented as certain or
uncertain that the event happened. For this task, the classification is divided into three categories: certain
events that happened (facts), certain events that did not happen (counter-facts), and uncertain events
(undefined). The objective is to train a classifier that predicts the factuality category of
tagged events in a given text. The training texts were obtained from Spanish and Uruguayan media.</p>
        <sec id="sec-2-1-1">
          <title>2.1.1. Initial Data Preprocessing</title>
          <p>The training texts are stored in XML files, so the first step was to extract the text into a
format we could work with. A long string was created containing every sentence of the corpus
with its marked events. This string was then split into individual sentences, repeating each sentence
once per marked event so that each copy carries a single marked event.</p>
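          <p>As an illustration of the last step, the following minimal Python sketch repeats a sentence once per marked event. The function name expand_per_event and the input format (a word list plus the positions of its events) are assumptions made for this example; they are not taken from the original implementation, which reads the markup from the XML corpus.</p>
          <preformat>
```python
def expand_per_event(tokens, event_positions):
    """Return one (tokens, event_index) instance per marked event.

    tokens: list of the words in the sentence.
    event_positions: indices of the words marked as events (hypothetical
    input format chosen for this sketch).
    """
    return [(list(tokens), pos) for pos in event_positions]


sentence = ["El", "volcán", "ha", "vuelto", "a", "la", "normalidad"]
instances = expand_per_event(sentence, [2, 3])
for words, event_index in instances:
    # Each copy of the sentence points at exactly one event word.
    print(event_index, words[event_index])
```
          </preformat>
          <p>With this layout, a sentence without marked events produces no instances, and a sentence with two events produces two copies that differ only in the event they point at.</p>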
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. RNN + Word Embeddings</title>
        <sec id="sec-2-2-1">
          <title>2.2.1. Preprocessing</title>
          <p>
            Once the data has the structure described above, sentences are split into words, and each word is
mapped to its word embedding, using an embeddings file trained with word2vec over
the corpus from [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ]. Each word thus becomes a 300-dimensional vector. In order to distinguish the
event word in the sentence, one extra component is appended to the vector, making it 301-dimensional. The value of this
component indicates whether the word is the event or not.
          </p>
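          <p>The extra component can be sketched as follows. This is a minimal illustration with toy 3-dimensional vectors standing in for the 300-dimensional word2vec embeddings; the function name add_event_flag and the choice of 1.0 and 0.0 as flag values are assumptions made for the example.</p>
          <preformat>
```python
def add_event_flag(vectors, event_index):
    """Append a flag to each word vector: 1.0 for the event word, else 0.0."""
    return [
        list(vec) + [1.0 if i == event_index else 0.0]
        for i, vec in enumerate(vectors)
    ]


# Toy 3-dimensional stand-ins for the 300-dimensional word2vec vectors.
sentence_vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
flagged = add_event_flag(sentence_vectors, event_index=1)
print([len(v) for v in flagged])  # every vector grows by one dimension
print([v[-1] for v in flagged])   # only the event word carries the flag
```
          </preformat>
          <p>Applied to real 300-dimensional embeddings, the same operation yields the 301-dimensional inputs described above.</p>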
          <p>
            Other approaches included representing words by their POS tags. This was done with the nltk [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ]
interface to the Stanford POS-Tagger [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ], and the vectors were built with an in-house
encoding based on tag classes and attributes. The resulting accuracy was considerably low.
          </p>
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2.2. Padding</title>
          <p>The next step was padding the sentences in order to normalize the length of each input for the
neural network. Every sentence is padded to the length of the longest sentence available.</p>
        </sec>
        <sec id="sec-2-2-3">
          <title>2.2.3. Class weights</title>
          <p>Given that the training corpus was extremely unbalanced, weighting the classes seemed appropriate. The
weights obtained from the training corpus are:
- facts: 0.9121
- counter-facts: 2.1925
- undefined: 11.3884</p>
        </sec>
        <sec id="sec-2-2-4">
          <title>2.2.4. RNN</title>
          <p>
            For the neural network implementation, the TensorFlow [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ] Keras library [6] is used. The
sequential model consists of a GRU layer with 200 units and a dense layer with a 3-dimensional
output, one dimension per category. The model is compiled with the
categorical_crossentropy loss function and the adam optimizer.
          </p>
          <p>To choose the number of epochs, early stopping was used. We concluded that the best
number of epochs was between 25 and 30, so it is set to 28. Similarly, some exploratory
testing was done on the batch size; after trying different values, 30 was chosen.</p>
        </sec>
        <sec id="sec-2-2-5">
          <title>2.2.5. Results</title>
          <p>The results from this approach are:
• Precision: 0.611
• Recall: 0.603
• F1-macro: 0.607
• Accuracy: 0.848</p>
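          <p>As a quick consistency check, the reported F1-macro coincides with the harmonic mean of the reported precision and recall. The sketch below only verifies this arithmetic; that the evaluation script computes the score this way (rather than by averaging per-class F1 values) is an assumption, and the function name f1_from is chosen for the example.</p>
          <preformat>
```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


score = f1_from(0.611, 0.603)
print(round(score, 3))  # 0.607
```
          </preformat>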
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. RNN + char level</title>
        <sec id="sec-2-3-1">
          <title>2.3.1. Preprocessing</title>
          <p>This technique is strongly based on Aspie96 [7], where the unit of information is the character:
the characters of an event and of its neighbors, including spaces and non-word characters.</p>
          <p>First, each event is split into a list of single characters, as are its left and right neighbors, which are
concatenated with it into a larger list as long as they fit into the window. Then, every character of that list, which has the event at its center, is
encoded with one-hot encoding (the representation of each character is
looked up in a dictionary of all the characters that may appear in the sources). Finally, to
identify the characters that are part of the event, each of them is marked with a dedicated flag.</p>
        </sec>
        <sec id="sec-2-3-1a">
          <title>2.3.2. RNN</title>
          <p>For the neural network implementation, the TensorFlow Keras library [6] is used. The sequential
model consists of two LSTM layers with 75 units each and a dense layer with a
3-dimensional output. It uses sigmoid as the activation function, categorical cross-entropy as the loss function
and Adam as the optimizer. Training stops after 14 epochs without improvement,
up to a maximum of 150 epochs.</p>
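          <p>The character-level preprocessing of this approach (a window around the event, one-hot encoding of each character, and an event flag) can be sketched as follows. This is a simplified illustration: the context is cut at a fixed number of characters on each side instead of at whole neighboring words, and the function name encode_event_window, the side parameter and the toy alphabet are assumptions made for the example.</p>
          <preformat>
```python
def encode_event_window(text, start, end, alphabet, side):
    """One-hot encode text[start:end] plus up to `side` context characters
    on each side, appending a flag that marks the characters of the event."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    lo = max(0, start - side)
    hi = min(len(text), end + side)
    rows = []
    for pos in range(lo, hi):
        row = [0] * (len(alphabet) + 1)
        ch = text[pos]
        if ch in index:               # unknown characters stay all-zero
            row[index[ch]] = 1
        if pos in range(start, end):  # event flag in the last column
            row[-1] = 1
        rows.append(row)
    return rows


text = "ha vuelto a la normalidad"
alphabet = sorted(set(text))          # toy alphabet built from the sample
rows = encode_event_window(text, 3, 9, alphabet, side=3)
print(len(rows), len(rows[0]))
print([r[-1] for r in rows])
```
          </preformat>
          <p>Here the event is the word vuelto (characters 3 to 8), so only its six rows carry the flag, while the three context characters on each side are encoded without it.</p>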
        </sec>
        <sec id="sec-2-3-2">
          <title>2.3.3. Results analysis</title>
          <p>In the table below, it can be seen how precision and recall decrease between the datasets, which
leads to a decrease in the F1 metric as well. The loss of performance could be due to differences
in the structure of the datasets, and probably to slight overfitting during training.</p>
        </sec>
        <sec id="sec-2-3-3">
          <title>2.3.4. Future work</title>
          <p>For future work on this approach, several model configurations can be explored to achieve
better results. These include modifying the window size and tuning model hyperparameters
such as the recurrent activation function, the number of units in each layer or the
optimizer.</p>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Random Forest with Tag counts</title>
        <sec id="sec-2-4-1">
          <title>2.4.1. Preprocessing</title>
          <p>This approach is based on a morphological analysis of the context of each event. For every event
we consider the event itself and a fixed-size window of the words preceding it. Then,
part-of-speech (POS) tagging is applied to the extracted text using the spaCy Spanish POS-tagger<sup>1</sup>.
Among other information, the tagger returns for each word a label that identifies its
POS tag. With this information, the number of occurrences
of every POS tag is counted; the POS tag of the event to classify may be counted more than once.
The counts are mapped to an array in which every position represents a POS tag. The
resulting array is the input for the classifier.</p>
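          <p>The construction of the count array can be sketched as follows. This is a minimal illustration, not the original code: the tags are hand-assigned rather than produced by the tagger, and the function name tag_count_features, the reduced tag set and the parameter names window and event_weight are assumptions made for the example.</p>
          <preformat>
```python
def tag_count_features(tags, event_index, tagset, window=2, event_weight=2):
    """Count POS tags over the event and the `window` words before it.

    The event's own tag is counted `event_weight` times. Returns a list
    with one position per tag in `tagset`.
    """
    position = {tag: i for i, tag in enumerate(tagset)}
    counts = [0] * len(tagset)
    first = max(0, event_index - window)
    for tag in tags[first:event_index]:
        counts[position[tag]] += 1
    counts[position[tags[event_index]]] += event_weight
    return counts


# Hand-assigned tags for "El volcán ha vuelto" (illustrative only).
TAGSET = ["DET", "NOUN", "AUX", "VERB"]
tags = ["DET", "NOUN", "AUX", "VERB"]
features = tag_count_features(tags, event_index=3, tagset=TAGSET)
print(features)  # [0, 1, 1, 2]
```
          </preformat>
          <p>The two preceding words contribute one count each, and the tag of the event contributes two; the resulting list is what a classifier would receive as input.</p>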
        </sec>
        <sec id="sec-2-4-2">
          <title>2.4.2. Random Forest</title>
          <p>For this model we use the Random Forest classifier from sklearn [8]<sup>2</sup>, which receives the input described above
and outputs one of the 3 possible values corresponding to the task.
During model training, the window length and the number of times that the event’s
POS tag is counted are tuned. The configuration that gives the best results counts
the event’s POS tag twice and uses a window length of two words.</p>
          <p><sup>1</sup> https://spacy.io/models/es
<sup>2</sup> https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html</p>
        </sec>
      </sec>
      <sec id="sec-2-5">
        <title>2.5. SVC with Tag counts + Word Embeddings</title>
        <sec id="sec-2-5-1">
          <title>2.5.1. Preprocessing</title>
          <p>
            This approach extends the Random Forest with windowed input described above. The
main idea is to add more information about the event itself. To achieve this, a word embedding
representation of the event is concatenated to each input, using an embeddings file
trained with word2vec over the corpus from [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ] for the encoding.
          </p>
        </sec>
        <sec id="sec-2-5-2">
          <title>2.5.2. SVC</title>
          <p>The first attempt used a Random Forest as classifier, but the results were not good.
The second attempt used the SVC classifier from sklearn<sup>3</sup>, which gave better results than
both that Random Forest and the Random Forest with windowed input.
During training, the window length and the number of times that the event’s
POS tag is counted are tuned. The best configuration uses a window length of two words
and counts the event’s POS tag once.</p>
        </sec>
      </sec>
      <sec id="sec-2-6">
        <title>2.6. Results</title>
        <p>The next table shows the metrics over the test data for each of the models described above.
We can observe that the RNN with Word Embeddings and the SVC approach ended up neck and
neck, with a difference of 0.015 in the F1 metric and 0.017 in accuracy. Interestingly,
the SVC’s precision was slightly higher than the RNN’s, yet the RNN achieved better recall.
Each of the models had a much higher F1 metric than the baseline.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Event Identification</title>
      <sec id="sec-3-1">
        <title>3.1. Task2 description</title>
        <p>This task is the step prior to Task 1: it aims to automatically identify the events in a given text.
The input is plain text, and the algorithm has to output the indices of the words that represent
events in it.</p>
        <p><sup>3</sup> https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html</p>
        <p>For instance, if the input is El/1 volcán/2 de/3 Fuego/4 ha/5 vuelto/6 a/7 la/8 normalidad/9 ,/10
aunque/11 mantiene/12 explosiones/13 moderadas/14, the output should be 5, 6, 12, 13.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Baseline</title>
        <p>The competition organizers proposed a simple algorithm in order to set a baseline for the
competitors. This classifier assigns the class ’event’ to the words tagged as ’event’ at least once
in the training corpus. This approach gets an F1-score of 0.597.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Verbs detection</title>
        <p>There are two types of events, verbal and nominal. Studying the training corpus (the same one used in
Task 1), the team noticed that noun events are just 16.5% of the total. This motivates a simple
rule-based approach, in which the classifier identifies a word as an event if it is a verb.</p>
        <p>The code is as simple as described: it uses the nltk Stanford POS-Tagger and checks whether the tagger
labels the word to classify as a verb.</p>
        <p>Three metrics are used for the task evaluation. The results obtained for this approach are:
• Precision: 0.993
• Recall: 0.736
• Macro-f1: 0.845</p>
        <p>The high precision is due to the fact that almost every verb is an event, making those
predictions trivial. The high recall indicates that the test corpus is also unbalanced and that most of its
events are verbal, as the team expected.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Verbs detection + Nouns detection</title>
        <p>In order to include noun events in the classifier, and inspired by the baseline approach, a second rule
is added to the algorithm: the class ’event’ is assigned to verbs and to nouns that appear
at least once as events in the training corpus.</p>
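        <p>The two rules can be sketched in a few lines of Python, using the example sentence from the task description. This is an illustration under stated assumptions: the coarse tags are hand-assigned instead of coming from the Stanford POS-Tagger, the set of training event nouns is hypothetical, and the function name identify_events is chosen for the example.</p>
        <preformat>
```python
def identify_events(tagged_words, training_event_nouns):
    """Return the 1-based indices of the words classified as events.

    A word is an event if it is tagged as a verb, or if it is a noun that
    appeared as an event in the training corpus.
    """
    events = []
    for i, (word, tag) in enumerate(tagged_words, start=1):
        is_verb = tag == "VERB"
        is_known_noun = tag == "NOUN" and word.lower() in training_event_nouns
        if is_verb or is_known_noun:
            events.append(i)
    return events


# Hand-assigned coarse tags (illustrative, not real tagger output).
sentence = [
    ("El", "DET"), ("volcán", "NOUN"), ("de", "ADP"), ("Fuego", "NOUN"),
    ("ha", "VERB"), ("vuelto", "VERB"), ("a", "ADP"), ("la", "DET"),
    ("normalidad", "NOUN"), (",", "PUNCT"), ("aunque", "CONJ"),
    ("mantiene", "VERB"), ("explosiones", "NOUN"), ("moderadas", "ADJ"),
]
known_event_nouns = {"explosiones"}  # hypothetical training lexicon
print(identify_events(sentence, known_event_nouns))  # [5, 6, 12, 13]
```
        </preformat>
        <p>With an empty lexicon the function reduces to the verbs-only rule of the previous section and misses the noun event at position 13.</p>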
        <p>This approach beats the previous one, obtaining a higher macro-f1 score, caused by a much
better recall:
• Precision: 0.950
• Recall: 0.792
• Macro-f1: 0.864</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Future work</title>
        <p>The good results obtained with such a simple approach are promising. Due to a lack of time
the team could not explore more complex solutions, but it would be interesting to test machine
learning techniques to identify more noun events.</p>
        <p>It is well known that noun events are context dependent, and the results obtained with the
second approach confirm this: precision drops, which means more false positives.
Using some ideas from Task 1, it would be interesting to classify words using windows around
them. This would give the classifiers the possibility to learn from context, rather than
just from the words themselves.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>For the first task, Factuality Classification, four approaches were implemented. The best
performing one (using the macro-f1 score as reference) is the Recurrent Neural Network combined with
Word Embeddings, which obtained a macro-f1 score of 0.607.</p>
      <p>The second task, Event Identification, was attempted with two rule-based classifiers, due to its
known characteristics. The one that got the best results is based on two rules: word w is an
event if (and only if) (1) w is a verb, or (2) w appeared in the training corpus as an event. This
classifier obtained a macro-f1 score of 0.864.</p>
      <p>For the latter task, the team believes that the use of context and of some more complex techniques
could greatly improve the obtained results.</p>
      <p>[6] F. Chollet, Keras, https://github.com/fchollet/keras, 2015.
[7] V. Giudice, Aspie96 at FACT (IberLEF 2019): Factuality Classification in Spanish Texts
with Character-Level Convolutional RNN and Tokenization, in: Proceedings of the Iberian
Languages Evaluation Forum (IberLEF 2019), CEUR Workshop Proceedings, CEUR-WS,
Bilbao, Spain, 2019.
[8] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel,
P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher,
M. Perrot, E. Duchesnay, Scikit-learn: Machine Learning in Python, Journal of Machine
Learning Research 12 (2011) 2825–2830.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rosá</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Alonso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Castellón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chiruzzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Curell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Góngora</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Malcuori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Vázquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wonsever</surname>
          </string-name>
          ,
          <article-title>Overview of FACT at IberLEF 2020: Events Detection and Classification</article-title>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Azzinnari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Martínez</surname>
          </string-name>
          , Representación de Palabras en Espacios de Vectores, Proyecto de grado, Universidad de la República, Uruguay,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Loper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bird</surname>
          </string-name>
          ,
          <article-title>Nltk: the natural language toolkit</article-title>
          ,
          <source>arXiv preprint cs/0205028</source>
          (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Klein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Singer</surname>
          </string-name>
          ,
          <article-title>Feature-rich part-of-speech tagging with a cyclic dependency network</article-title>
          , in:
          <source>Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, Volume 1</source>
          , Association for Computational Linguistics,
          <year>2003</year>
          , pp.
          <fpage>173</fpage>
          -
          <lpage>180</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Abadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Barham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Brevdo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Citro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Devin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghemawat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Harp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Irving</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Isard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jozefowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kudlur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Levenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mané</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Monga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Moore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Murray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Olah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schuster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shlens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Steiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Talwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vanhoucke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vasudevan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Viégas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Vinyals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Warden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wattenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wicke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <source>TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems</source>
          ,
          <year>2015</year>
          . URL: http://tensorflow.org/, software available from tensorflow.org.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>