          NEGES 2019 Task: Negation in Spanish

        Salud María Jiménez-Zafra1[0000−0003−3274−8825] , Noa Patricia Cruz
        Díaz2[0000−0002−6685−6747] , Roser Morante3[0000−0002−8356−6469] , and
                  María-Teresa Martín-Valdivia1[0000−0002−2874−0401]
    1
        SINAI, Computer Science Department, CEATIC, Universidad de Jaén, Spain
                             {sjzafra,maite}@ujaen.es
            2
               Centro de Excelencia de Inteligencia Artificial, Bankia, Spain
                                contact@noacruz.com
        3
          CLTL Lab, Computational Linguistics, VU University Amsterdam, The
                                     Netherlands
                               r.morantevallejo@vu.nl


         Abstract. This paper presents the 2019 edition of the NEGES task,
         Negation in Spanish, held on September 24 as part of the evaluation
         forum IberLEF at the 35th International Conference of the Spanish Society
         for Natural Language Processing. In this edition, two sub-tasks were
         proposed: Sub-task A: “Negation cues detection” and Sub-task B: “Role of
         negation in sentiment analysis”. The dataset used for both sub-tasks was
         the SFU ReviewSP -NEG corpus. About 13 teams showed interest in the
         task and 5 teams finally submitted results.

         Keywords: NEGES 2019 · negation · negation processing · cue detec-
         tion · sentiment analysis.


1       Introduction
Negation is a complex linguistic phenomenon that has been widely studied from
a theoretical perspective [16, 17], and less from an applied point of view. However,
the computational treatment of this phenomenon is attracting growing interest,
because it is relevant for a wide range of Natural Language Processing
applications, such as sentiment analysis or information retrieval, where it is
crucial to know when the meaning of a part of the text changes due to the
presence of negation. In fact, in recent years, several challenges and shared tasks have
focused on negation processing: the BioNLP’09 Shared Task 3 [20], the NeSp-
NLP 2010 Workshop: Negation and Speculation in Natural Language Processing
[25], the CoNLL-2010 shared task [12], the i2b2 NLP Challenge [30], the *SEM
2012 Shared Task [24], the ShARe/CLEF eHealth Evaluation Lab 2014 Task 2
[27], the ExProM Workshop: Extra-Propositional Aspects of Meaning in Com-
putational Linguistics [26, 6, 4] and the SemBEaR Workshop: Computational
Semantics Beyond Events and Roles [5, 1].
    Copyright © 2019 for this paper by its authors. Use permitted under Creative Com-
    mons License Attribution 4.0 International (CC BY 4.0). IberLEF 2019, 24 Septem-
    ber 2019, Bilbao, Spain.




    However, most of the research on negation has been done for English. There-
fore, the aim of the NEGES task4 is to advance the study of this phenomenon in
Spanish, the second most widely spoken language in the world and the third
most widely used on the Internet. The 2018 edition consisted of three tasks re-
lated to different aspects of negation [18]: Task 1 on reaching an agreement on
the guidelines to follow for the annotation of negation in Spanish, Task 2 on
identifying negation cues, and Task 3 on evaluating the role of negation in senti-
ment analysis. A total of 4 teams participated in the workshop, 2 for developing
annotation guidelines and 2 for negation cues detection. Task 3 had no partici-
pants. In this edition, the objective is to continue bringing together the scientific
community that is working on negation to discuss how it is being addressed and
what the main problems encountered are, as well as to share resources and tools
aimed at processing negation in Spanish.
    The rest of this paper is organized as follows. The proposed sub-tasks are
described in Section 2, and the data used is detailed in Section 3. Evaluation
measures are introduced in Section 4. Participating systems and their results are
summarized in Section 5. Finally, Section 6 concludes the paper.


2      Task description
In the 2019 edition of the NEGES task, Negation in Spanish, two sub-tasks were
proposed as a continuation of the tasks carried out in NEGES 2018 [18].
 – Sub-task A: “Negation cues detection”
 – Sub-task B: “Role of negation in sentiment analysis”
      The following is a description of each sub-task.

2.1     Sub-task A: Negation cues detection
Sub-task A of NEGES 2019 aimed to promote the development and evaluation
of systems for identifying negation cues in Spanish. Negation cues could be
simple, if they were expressed by a single token (e.g., “no” [no/not], “sin” [with-
out] ), continuous, if they were composed of a sequence of two or more contiguous
tokens (e.g., “ni siquiera” [not even], “sin ningún” [without any] ), or discontin-
uous, if they consisted of a sequence of two or more non-contiguous tokens (e.g.,
“no...apenas” [not...hardly], “no...nada” [not...nothing] ). For example, in sen-
tence (1) the systems had to identify four negation cues: i) the discontinuous
cue “No...nada” [Not...nothing], ii) the simple cue “no” [no/not], iii) the simple
cue “no” [no/not] again, and iv) the continuous cue “ni siquiera” [not even].

(1)     No1 tengo nada1 en contra del servicio del hotel, pero no2 pienso volver,
       no3 me ha gustado, ni siquiera4 las vistas son buenas.
       I have nothing against the service of the hotel, but I do not plan to return,
       I did not like it, not even the views are good.
4
    http://www.sepln.org/workshops/neges2019/
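
To make the cue types concrete, the following is a minimal sketch (an illustrative
Python representation, not the official data format of the task) that encodes the
four cues of sentence (1) as lists of token indices, so that simple, continuous and
discontinuous cues are handled uniformly:

tokens = ["No", "tengo", "nada", "en", "contra", "del", "servicio", "del", "hotel",
          ",", "pero", "no", "pienso", "volver", ",", "no", "me", "ha", "gustado",
          ",", "ni", "siquiera", "las", "vistas", "son", "buenas", "."]

# Each cue is the list of the indices of its tokens; a discontinuous cue simply
# has non-adjacent indices.
cues = [
    [0, 2],    # discontinuous cue: "No ... nada"
    [11],      # simple cue: "no"
    [15],      # simple cue: "no"
    [20, 21],  # continuous cue: "ni siquiera"
]

for cue in cues:
    print(" ... ".join(tokens[i] for i in cue))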








    Participants received a set of training and development data consisting of
reviews of movies, books and products from the SFU ReviewSP -NEG corpus [19]
to build their systems during the development phase. At a later stage, a test
set was made available for evaluation. Finally, the participants’ submissions
were evaluated against the gold standard annotations. It should be noted that
the datasets used in this sub-task were manually annotated with negation cues
by domain experts, following well-defined annotation guidelines [19, 23].


2.2   Sub-task B: Role of negation in sentiment analysis

Sub-task B of NEGES 2019 proposed to evaluate the impact of accurate negation
detection in sentiment analysis. In this task, participants had to develop a system
that used the negation information contained in a corpus of reviews of movies,
books and products, the SFU ReviewSP -NEG corpus [19], to improve the task
of polarity classification. They had to classify each review as positive or negative
using a heuristic that incorporated negation processing. For example, systems
should classify a review such as (2) as negative using the negation information
provided by the organization, a sample of which is shown in Figure 1.

(2)    El 307 es muy bonito, pero no os lo recomiendo. Por un fallo eléctrico te
      puedes matar en la carretera.
      The 307 is very nice, but I don’t recommend it. An electrical failure can kill
      you on the road.




                Fig. 1. Review annotated with negation information.








3     Data
The SFU ReviewSP -NEG corpus5 [19] was the collection of documents provided
for training and testing the systems in Sub-task A and Sub-task B 6 . This corpus
is an extension of the Spanish part of the SFU Review corpus [29] and it could
be considered the counterpart of the SFU Review Corpus with negation and
speculation annotations [21].
    The Spanish SFU Review corpus [29] consists of 400 reviews extracted from
the website Ciao.es that belong to 8 different domains: cars, hotels, washing ma-
chines, books, cell phones, music, computers, and movies. For each domain there
are 50 positive and 50 negative reviews, defined as positive or negative based on
the number of stars given by the reviewer (1-2=negative; 4-5=positive; 3-star
reviews were not included). Later, it was extended to the SFU ReviewSP -NEG
corpus [19] in which each review was automatically annotated at the token level
with fine and coarse PoS-tags and lemmas using Freeling [28], and manually an-
notated at the sentence level with negation cues and their corresponding scopes
and events. Moreover, it is the first Spanish corpus in which the way negation
affects the words within its scope is annotated, that is, whether there is a change
in polarity or an increase or decrease of its value. Finally, it is important
to note that the corpus is in XML format and it is freely available for research
purposes.

3.1    Datasets Sub-task A
The SFU ReviewSP -NEG corpus [19] was randomly split into training, development
and test sets, with 33 reviews per domain for training, 7 reviews per domain for
development and 10 reviews per domain for testing. The data was converted to
CoNLL format [7], in which each line corresponds to a token, each annotation is
provided in a column and empty lines indicate the end of a sentence. The content
of the columns is the following (a parsing sketch is provided after the list):
 – Column 1: domain filename
 – Column 2: sentence number within domain filename
 – Column 3: token number within sentence
 – Column 4: word
 – Column 5: lemma
 – Column 6: part-of-speech
 – Column 7: part-of-speech type
 – Columns 8 to last: if the sentence has no negations, column 8 has a “***”
   value and there are no more columns. Otherwise, the annotation for each
   negation is provided in three columns: the first column contains the word
   that belongs to the negation cue, and the second and third columns contain
   “-”.
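
As a concrete illustration of this format, the following minimal Python sketch (an
assumption about whitespace-separated columns and file names, not official code)
reads such a file and collects the negation cues of each sentence:

def read_sentences(path):
    """Yield sentences as lists of column lists (empty lines end a sentence)."""
    sentence = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                if sentence:
                    yield sentence
                    sentence = []
            else:
                sentence.append(line.split())
    if sentence:
        yield sentence

def negation_cues(sentence):
    """Return one list of (token number, cue word) pairs per annotated negation."""
    if sentence[0][7] == "***":            # column 8: the sentence has no negation
        return []
    n_negs = (len(sentence[0]) - 7) // 3   # three columns per annotated negation
    cues = [[] for _ in range(n_negs)]
    for cols in sentence:
        for i in range(n_negs):
            cue_word = cols[7 + 3 * i]     # first column of each triple: cue word or "-"
            if cue_word != "-":
                cues[i].append((cols[2], cue_word))
    return cues

for sent in read_sentences("train.conll"):  # hypothetical file name
    print(negation_cues(sent))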
5
    http://sinai.ujaen.es/sfu-review-sp-neg-2/
6
    To download the data in the format provided for Sub-task A and Sub-task B go to
    http://www.sepln.org/workshops/neges2019/ or send an email to the organizers








    Figure 2 and Figure 3 show examples of the format of the files with different
types of sentences. In the first example (Figure 2) there is no negation, so the
8th column is “***” for all tokens, whereas the second example (Figure 3) is a
sentence with two negation cues, in which the information for the first negation
is provided in columns 8-10, and for the second in columns 11-13.




                Fig. 2. Sentence without negation in CoNLL format.




               Fig. 3. Sentence with two negations in CoNLL format.



   The distribution of reviews and negation cues in the datasets is provided
in Table 1: 264 reviews with 2,511 negation cues for training the systems, 56
reviews with 594 negation cues for the tuning process, and 80 reviews with 836
negation cues for the final evaluation.








 Table 1. Distribution of reviews and negation cues in the datasets of Sub-task A.

                                          Reviews Negation cues
                Training                       264         2,511
                Development                     56           594
                Test                            80           836
                Total                          400         3,941




3.2   Datasets Sub-task B

For this sub-task, we provided the SFU ReviewSP -NEG corpus [19] in its
original format (XML). The meanings of the labels found in the reviews are the
following (a small processing sketch is provided after the list):

 – <review polarity=“...”>. It describes the polarity of the review, which can
   be “positive” or “negative”.
 – <sentence complex=“...”>. This label corresponds to a complete phrase or a
   fragment thereof in which a negation structure can appear. It has an associ-
   ated complex attribute that can take one of the following values:
     • “yes”, if the sentence contains more than one negation structure.
     • “no”, if the sentence only has one negation structure.
 – <neg_structure>. This label corresponds to a syntactic structure in which a
   negation cue appears. It has 4 possible attributes, two of which (change and
   polarity modifier ) are mutually exclusive.
     • polarity: it presents the semantic orientation of the negation structure
       (“positive”, “negative” or “neutral”).
     • change: it indicates whether the polarity or meaning of the negation
       structure has been completely changed because of the negation (change=
       “yes”) or not (change=“no”).
     • polarity modifier: it states whether the negation structure contains an
       element that nuances its polarity. It can take the value “increment” if
       there is an increment in the intensity of the polarity or, on the contrary,
       it can take the value “reduction” if there is a reduction of it.
     • value: it reflects the type of the negation structure, that is, “neg” if
       it expresses negation, “contrast” if it indicates contrast or opposition
       between terms, “comp” if it expresses a comparison or inequality between
       terms, or “noneg” if it does not negate despite containing a negation cue.
 – <scope>. This label delimits the part of the negation structure that is within
   the scope of negation. It includes both the negation cue (<negexp>) and the
   event (<event>).
 – <negexp>. It contains the word(s) that constitute(s) the negation cue. It
   can have an associated discid attribute if the negation cue is formed by
   discontinuous words.
 – <event>. It contains the words that are directly affected by the negation
   (usually verbs, nouns or adjectives).
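
As an illustration of how this annotation could feed a polarity heuristic, the
following is a minimal sketch (one possible heuristic, not the approach of any
participant; the toy lexicon and the file name are assumptions): it scores a review
with a word-polarity lexicon and flips the contribution of the words inside a
negation scope whenever the enclosing negation structure has change=“yes”.

import xml.etree.ElementTree as ET

# Toy lexicon for illustration only.
LEXICON = {"bonito": 1.0, "recomiendo": 1.0, "fallo": -1.0, "matar": -1.0}

def word_scores(text):
    return sum(LEXICON.get(w.strip(".,").lower(), 0.0) for w in text.split())

def classify_review(xml_path):
    review = ET.parse(xml_path).getroot()
    score = 0.0
    for sentence in review.iter("sentence"):
        # Baseline: every word contributes its lexicon polarity.
        score += word_scores("".join(sentence.itertext()))
        # Negation handling: flip the contribution of words inside a changed scope.
        for neg in sentence.iter("neg_structure"):
            if neg.get("change") == "yes":
                scope = neg.find("scope")
                if scope is not None:
                    score -= 2 * word_scores("".join(scope.itertext()))
    return "positive" if score >= 0 else "negative"

print(classify_review("no_2.xml"))  # hypothetical review file, e.g. example (2)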








    The distribution of reviews in the training, development and test sets is
provided in Table 2, as well as the distribution of the different negation structures
per dataset. The totals of positive and negative reviews are given in the rows
labeled + Reviews and - Reviews, respectively.


    Table 2. Distribution of reviews and negation cues in the datasets of Sub-task B.

                             Training         Dev.          Test       Total
               Reviews              264            56         80          400
               + Reviews            134            22         44          200
               - Reviews            130            34         36          200
               neg                2,511           594        836        3,941
               noneg                104            22         55          181
               contrast             100            23         52          175
               comp                  18             6          6           30




4      Evaluation measures

The evaluation script used to evaluate the systems presented in Sub-task A was
the same as the one used to evaluate the *SEM 2012 Shared Task: “Resolving
the Scope and Focus of Negation” [24]. It is based on the following criteria:

 – Punctuation tokens are ignored.
 – A True Positive (TP) requires that all tokens of the negation element be
   correctly identified.
 – To evaluate cues, partial matches are not counted as False Positive (FP),
   only as False Negative (FN). This is to avoid penalizing partial matches
   more than missed matches.

    The measures used to evaluate the systems were Precision (P), Recall (R) and
F-score (F1). In the proposed evaluation, FN are counted either by the system
not identifying negation cues present in the gold annotations, or by identifying
them partially, i.e., not all tokens have been correctly identified or the word
forms are incorrect. FP are counted when the system produces a negation cue not
present in the gold annotations and TP are counted when the system produces
negation cues exactly as they are in the gold annotations.
    For evaluating Sub-task B, the traditional measures used in text classi-
fication were applied: P, R, F1 and Accuracy (Acc). P, R and F1 were
measured per class and averaged using the macro-average method.

                          P = TP / (TP + FP)                                (1)

                          R = TP / (TP + FN)                                (2)

                          F1 = 2 · P · R / (P + R)                          (3)

                          Acc = (TP + TN) / (TP + TN + FP + FN)             (4)
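
To make the cue-level criteria above concrete, the following is a minimal Python
sketch (an illustration, not the official *SEM 2012 evaluation script) in which each
cue is represented as a set of (sentence, token) positions, exact matches count as
TP, partially matched gold cues count only as FN, and a predicted cue is an FP
only if it overlaps no gold cue:

def cue_scores(gold_cues, pred_cues):
    gold = set(map(frozenset, gold_cues))
    pred = set(map(frozenset, pred_cues))
    tp = len(gold & pred)                    # exact matches
    fn = len(gold) - tp                      # missed or only partially matched gold cues
    # A prediction is an FP only if it overlaps no gold cue at all;
    # partial overlaps were already penalized as FN above.
    fp = sum(1 for p in pred - gold if not any(p & g for g in gold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: one exact match, one partial match (counted only as FN), one miss.
gold = [{(0, 1)}, {(1, 2), (1, 5)}, {(2, 3)}]
pred = [{(0, 1)}, {(1, 2)}]
print(cue_scores(gold, pred))  # -> (1.0, 0.333..., 0.5)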

5   Participants

13 teams showed interest in the task and 5 teams submitted results.
    Sub-task A had 4 participants: Aspie96 from the University of Turin, the
CLiC team from Universitat de Barcelona, the IBI team from the Integrative
Biomedical Informatics group of Universitat Pompeu Fabra, and the UNED team
from Universidad Nacional de Educación a Distancia (UNED) and Instituto Mixto
de Investigación-Escuela Nacional de Sanidad (IMIENS). The official results by
domain are shown in Table 3, and overall results are presented in Table 4, both
evaluated in terms of P, R and F1. For the IBI and UNED teams, the domain in
which it was most difficult to detect negation cues was that of cell phone reviews,
while for Aspie96 and CLiC it was the domain of hotel and book reviews,
respectively. In terms of overall performance, the results of Aspie96 were quite
low compared to those of the other teams. The CLiC, IBI and UNED teams
obtained similar precision; however, the CLiC team achieved the highest recall,
reaching the first position in the ranking.


                 Table 3. Official results by domain for Sub-task A.

                        Aspie96           CLiC               IBI             UNED
 Domain             P     R     F1    P    R   F1       P     R    F1    P    R   F1
 Books            16.00 28.57 20.51 80.59 75.79 78.12 80.97 72.62 76.57 84.02 81.35 82.66
 Cars             19.42 29.41 23.39 94.92 82.35 88.19 92.73 75.00 82.93 94.83 80.88 87.30
 Cell phones      18.07 26.32 21.53 87.76 75.44 81.13 90.48 66.67 76.77 88.37 66.67 76.00
 Computers        17.36 25.93 20.80 90.48 93.83 92.12 89.06 70.37 78.62 94.12 79.01 85.91
 Hotels            10.59 15.25 12.50 87.50 71.19 78.51 97.67 71.19 82.35 93.62 74.58 83.02
 Movies           20.53 33.13 25.35 88.67 81.60 84.99 90.30 74.23 81.48 89.86 81.60 85.53
 Music            24.17 33.33 28.02 94.44 78.16 85.53 94.20 74.71 83.33 95.38 71.26 81.57
 Washing machines 24.24 34.78 28.57 92.98 76.81 84.13 94.34 72.46 81.96 94.34 72.46 81.96




    Aspie96 [15] presented a model based on a convolutional Recurrent Neural
Network (RNN) previously used for irony detection in Italian tweets [13] at the
IronITA shared task [8]. In order to address the NEGES task, the system was
modified to take tokens and Spanish spelling into account. Each word was
represented using a 50-character window in which non-word tokens were also
considered.







                   Table 4. Overall official results for Sub-task A.

                     Team            P            R         F1
                     CLiC          89.67         79.40    84.09
                     UNED          91.82         75.98    82.99
                     IBI           91.22         72.16    80.50
                     Aspie96       18.80         28.34    22.58




The words were then fed into a GRU layer to expand the context. The GRU
layer’s output was fed to a classifier that labeled each word as not part of a
negation cue, as the first word of a negation cue, or as part of the most recently
started negation cue. A similar model was shown to be suitable for the
classification of irony [13] and factuality [14], but it does not appear to be
suitable for negation: its results in this task are quite low compared to those of
the other competing systems.
    The CLiC team [3] developed a system based on the Conditional Random
Field (CRF) algorithm, inspired by the system of Loharja et al. (2018) [22]
presented at NEGES 2018 [18], which had achieved the best results. They used
as features the word forms and PoS tags of the current word, the following word
and the previous six words. They also conducted experiments including two post-
processing methods: a set of rules and a vocabulary list composed of candidate
cues extracted from an annotated corpus (NewsCom). Neither the rules nor
the list of candidates boosted the basic CRF’s results during the development
phase. Therefore, they submitted the CRF model without post-processing to the
competition, achieving the first position in the ranking.
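
A minimal sketch of this kind of CRF cue detector, in the spirit of the CLiC
feature set but not their actual code, could look as follows with sklearn-crfsuite
(sentences are assumed to be lists of (word, PoS) pairs and labels to follow a
B/I/O scheme for cues):

import sklearn_crfsuite

def token_features(sent, i):
    feats = {"bias": 1.0, "word": sent[i][0].lower(), "pos": sent[i][1]}
    if i + 1 < len(sent):                      # the following token
        feats.update({"+1:word": sent[i + 1][0].lower(), "+1:pos": sent[i + 1][1]})
    for k in range(1, 7):                      # up to six previous tokens
        if i - k >= 0:
            feats.update({f"-{k}:word": sent[i - k][0].lower(),
                          f"-{k}:pos": sent[i - k][1]})
    return feats

def sent_features(sent):
    return [token_features(sent, i) for i in range(len(sent))]

# train_sents: list of [(word, pos), ...]; train_labels: list of ["O", "B-CUE", ...]
def train(train_sents, train_labels):
    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
    crf.fit([sent_features(s) for s in train_sents], train_labels)
    return crf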
    The IBI team [9] experimented with four supervised learning approaches
(CRF, Random Forest, Support Vector Machine with linear kernel and XG-
Boost) using shallow textual, lemma, PoS tags and dependency tree features
to characterize each token. For Random Forest, Support Vector Machine with
linear kernel and XGBoost they also used the same set of features for the three
previous and three posterior tokens in order to model the context of the to-
ken in focus. The highest performance during the development phase was
obtained with the CRF approach. Therefore, they chose it for their submission,
reaching the third position in the competition.
    The UNED team [10] participated in the sub-task with a system based on
deep learning, which is an evolution of the system presented in the previous
edition of this workshop [11]. Specifically, they proposed a BiLSTM-based model
using word, PoS tag and character embedding features, and a one-hot vector
to represent casing information. Moreover, they included in the system a post-
processing phase in which some rules were used to correct frequent errors made
by the network. The results obtained represent an improvement over those of
the 2018 edition of NEGES [18] and placed the team in the second position
this year.
    Sub-task B had 1 participant: LTG-Oslo from the University of Oslo. The of-
ficial results per sentiment class (positive and negative) and overall results are








shown in Table 5. The results for the positive class are better than those for the
negative class and, overall, they do not represent a strong performance in absolute
numbers, but the proposed approach is very interesting. LTG-Oslo [2] addressed
the task using a multi-task learning approach in which a single model is trained
simultaneously for negation detection and sentiment analysis. Specifically, the
shared lower layers of a deep Bidirectional Long Short-Term Memory (BiLSTM)
network were used to predict negation, while the higher layers were dedicated to
predicting sentiment at the document level.
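
The following is a minimal sketch of such a hierarchical multi-task model (an
illustration of the idea described above, not the LTG-Oslo implementation; all
dimensions and the pooling choice are assumptions), written in PyTorch:

import torch
import torch.nn as nn

class MultiTaskSentimentNegation(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=100, n_cue_tags=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Lower (shared) BiLSTM, supervised with token-level negation labels.
        self.lower = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.cue_head = nn.Linear(2 * hidden, n_cue_tags)   # per-token negation tags
        # Higher BiLSTM for document-level sentiment on top of the shared states.
        self.upper = nn.LSTM(2 * hidden, hidden, batch_first=True, bidirectional=True)
        self.sent_head = nn.Linear(2 * hidden, 2)            # positive vs. negative

    def forward(self, token_ids):
        x = self.emb(token_ids)          # (batch, seq, emb_dim)
        low, _ = self.lower(x)           # shared representation
        cue_logits = self.cue_head(low)  # negation prediction per token
        high, _ = self.upper(low)
        doc = high.mean(dim=1)           # simple pooling over tokens
        sent_logits = self.sent_head(doc)
        return cue_logits, sent_logits

# Training would sum a token-level loss on cue_logits and a document-level loss
# on sent_logits, so that both tasks update the shared lower layers.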


                      Table 5. Official results for Sub-task B.

                                       LTG-Oslo
                          P               R                F1              Acc
    Positive class       68.90           70.50            69.70              -
    Negative class       62.90           61.10            62.00              -
    Overall              65.90           65.80            65.85           66.20




6   Conclusions

This paper presents the 2019 edition of the NEGES task, whose aim is to continue
advancing the state of the art of negation detection in Spanish. Specifically, this
edition consisted of 2 of the 3 sub-tasks carried out in the previous edition,
Sub-task A: “Negation cues detection” and Sub-task B: “Role of negation in
sentiment analysis”, both using the SFU ReviewSP -NEG corpus [19] to train
and test the systems presented.
    Compared to the previous edition, this year the workshop has attracted more
attention, with more teams interested in participating (15 vs. 10). In addition,
despite including one fewer sub-task, the number of submissions has been
higher: In the 2018 edition of NEGES, a total of 4 teams participated in the
workshop, 2 for developing annotation guidelines and 2 for cues detection. The
task of studying the role of negation in sentiment analysis had no participants.
This year, 5 teams submitted results, 4 for identifying negation cues and 1 for
studying the role of negation in sentiment analysis. The low number of submis-
sions in the last sub-task may be due to the fact that in order to study the
impact of accurate negation detection in sentiment analysis it is necessary to
determine how to efficiently represent negation, in the case of machine learning
systems, or how to modify the polarity of the words within the scope of negation
in the case of lexicon-based systems.
    Regarding the approaches followed to detect negation cues, the teams continue
to opt both for more traditional machine learning approaches and for deep
learning algorithms, and the results confirm that the use of Conditional Random
Fields obtains the best results in this sub-task.








     Concerning the system errors and difficulties encountered in the identification
of negation cues, we can say the following. Aspie96 reported that the low results
of its system could be due to the fact that only the text of the documents had
been taken into account, without incorporating features such as the lemma and
the PoS tags of the words, which could be of help. In fact, the other teams
used them and obtained good results. The CLiC team reported several types of
errors: errors in identifying negation cues that do not express negation (e.g. “Ya
estaba casi, no (B)?” [It was almost there, wasn’t it?] ); not correctly identifying
continuous cues (e.g. “a no ser que” [unless], “a excepción de” [with the exception
of ], “a falta de” [in the absence of ] ); tagging elements such as “tan” [so], “tanto”
[so much], “muy” [very] or “mucho” [much] in discontinuous cues; and not
detecting discontinuous cues. The IBI team found that the performance of the
approaches tested decreases drastically when they deal with multi-token negation
cues. The UNED team also found it more difficult to identify multiple-term
negation cues.
     As for the difficulties and errors in the evaluation of the role of negation
in sentiment analysis, LTG-Oslo stated that, given that the task is performed
at the document level, it is difficult to determine them exactly. However, they
concluded that the multi-task model (MTL) is better than the single-task
sentiment model (STL) for this sub-task and that the training set size and the
diversity of domains complicate the use of deep neural architectures.
     Future editions of the workshop will also focus on detecting negation in other
domains, such as the biomedical domain, and on studying other components of
negation, such as the scope. Moreover, participants will be required to include
an error analysis of the results presented.


Acknowledgements

This work has been partially supported by a grant from the Ministerio de Edu-
cación Cultura y Deporte (MECD - scholarship FPU014/00983), Fondo Europeo
de Desarrollo Regional (FEDER), and REDES project (TIN2015-65136-C2-1-R)
and LIVING-LANG project (RTI2018-094653-B-C21) from the Spanish Govern-
ment. RM is supported by the Netherlands Organization for Scientific Research
(NWO) via the Spinoza-prize awarded to Piek Vossen (SPI 30-673, 2014-2019).


References
 1. Proceedings of the Workshop on Computational Semantics beyond
    Events and Roles. Association for Computational Linguistics, New
    Orleans,    Louisiana    (Jun    2018).    https://doi.org/10.18653/v1/W18-13,
    https://www.aclweb.org/anthology/W18-1300
 2. Barnes, J.: LTG-Oslo Hierarchical Multi-task Network: The Importance of Nega-
    tion for Document-level Sentiment in Spanish. In: Proceedings of the Iberian Lan-
    guages Evaluation Forum (IberLEF 2019). CEUR Workshop Proceedings, CEUR-
    WS, Bilbao, Spain (2019)








 3. Beltrán, J., González, M.: Detection of Negation Cues in Spanish: The CLiC-Neg
    System. In: Proceedings of the Iberian Languages Evaluation Forum (IberLEF
    2019). CEUR Workshop Proceedings, CEUR-WS, Bilbao, Spain (2019)
 4. Blanco, E., Morante, R., Saurı́, R.: Proceedings of the workshop on extra-
    propositional aspects of meaning in computational linguistics (exprom). In: Pro-
    ceedings of the Workshop on Extra-Propositional Aspects of Meaning in Compu-
    tational Linguistics (ExProM) (2016)
 5. Blanco, E., Morante, R., Saurı́, R.: Proceedings of the workshop computational
    semantics beyond events and roles. In: Proceedings of the Workshop Computational
    Semantics Beyond Events and Roles (2017)
 6. Blanco, E., Morante, R., Sporleder, C.: Proceedings of the second workshop on
    extra-propositional aspects of meaning in computational semantics (exprom 2015).
    In: Proceedings of the Second Workshop on Extra-Propositional Aspects of Mean-
    ing in Computational Semantics (ExProM 2015) (2015)
 7. Buchholz, S., Marsi, E.: CoNLL-X shared task on multilingual dependency pars-
    ing. In: Proceedings of the tenth conference on computational natural language
    learning. pp. 149–164 (2006)
 8. Cignarella, A.T., Frenda, S., Basile, V., Bosco, C., Patti, V., Rosso, P., et al.:
    Overview of the evalita 2018 task on irony detection in italian tweets (ironita). In:
    Proceedings of the 6th evaluation campaign of Natural Language Processing and
    Speech tools for Italian (EVALITA’18). pp. 26–34 (2018), http://ceur-ws.org/Vol-
    2263/paper005.pdf
 9. Domı́nguez-Mas, L., Ronzano, F., Furlong, L.I.: Supervised Learning Approaches
    to Detect Negation Cues in Spanish Reviews. In: Proceedings of the Iberian Lan-
    guages Evaluation Forum (IberLEF 2019). CEUR Workshop Proceedings, CEUR-
    WS, Bilbao, Spain (2019)
10. Fabregat, H., Duque, A., Martı́nez-Romo, J., Araujo, L.: Extending a Deep Learn-
    ing approach for Negation Cues Detection in Spanish. In: Proceedings of the
    Iberian Languages Evaluation Forum (IberLEF 2019). CEUR Workshop Proceed-
    ings, CEUR-WS, Bilbao, Spain (2019)
11. Fabregat, H., Martı́nez-Romo, J., Araujo, L.: Deep Learning Approach for Nega-
    tion Cues Detection in Spanish at NEGES 2018. In: Proceedings of NEGES 2018:
    Workshop on Negation in Spanish, CEUR Workshop Proceedings. vol. 2174, pp.
    43–48 (2018)
12. Farkas, R., Vincze, V., Móra, G., Csirik, J., Szarvas, G.: The CoNLL-2010 shared
    task: learning to detect hedges and their scope in natural language text. In:
    Proceedings of the Fourteenth Conference on Computational Natural Language
    Learning—Shared Task. pp. 1–12 (2010)
13. Giudice, V.: Aspie96 at IronITA (EVALITA 2018): Irony Detection in Italian
    Tweets with Character-Level Convolutional RNN. In: Proceedings of the 6th eval-
    uation campaign of Natural Language Processing and Speech tools for Italian
    (EVALITA’18). pp. 160–165 (2018), http://ceur-ws.org/Vol-2263/paper026.pdf
14. Giudice, V.: Aspie96 at FACT (IberLEF 2019): Factuality Classification in Spanish
    Texts with Character-Level Convolutional RNN and Tokenization. In: Proceedings
    of the Iberian Languages Evaluation Forum (IberLEF 2019). CEUR Workshop
    Proceedings, CEUR-WS, Bilbao, Spain (2019)
15. Giudice, V.: Aspie96 at NEGES (IberLEF 2019): Negation Cues Detection in Span-
    ish with Character-Level Convolutional RNN and Tokenization. In: Proceedings of
    the Iberian Languages Evaluation Forum (IberLEF 2019). CEUR Workshop Pro-
    ceedings, CEUR-WS, Bilbao, Spain (2019)








16. Horn, L.R.: A natural history of negation. CSLI Publications (1989)
17. Horn, L.: The expression of negation. De Gruyter Mouton, Berlin (2010)
18. Jiménez-Zafra, S.M., Cruz Díaz, N.P., Morante, R., Martín-Valdivia, M.T.:
    NEGES 2018: Workshop on Negation in Spanish. Procesamiento del Lenguaje Nat-
    ural (62), 21–28 (2019)
19. Jiménez-Zafra, S.M., Taulé, M., Martı́n-Valdivia, M.T., Ureña-López, L.A., Martı́,
    M.A.: SFU Review SP-NEG: a Spanish corpus annotated with negation for senti-
    ment analysis. A typology of negation patterns. Language Resources and Evalua-
    tion 52(2), 533–569 (2018)
20. Kim, J.D., Ohta, T., Pyysalo, S., Kano, Y., Tsujii, J.: Overview of bionlp’09 shared
    task on event extraction. In: Proceedings of the Workshop on Current Trends in
    Biomedical Natural Language Processing: Shared Task. pp. 1–9. Association for
    Computational Linguistics (2009)
21. Konstantinova, N., De Sousa, S.C., Dı́az, N.P.C., López, M.J.M., Taboada, M.,
    Mitkov, R.: A review corpus annotated for negation, speculation and their scope.
    In: Proceedings of LREC 2012. pp. 3190–3195 (2012)
22. Loharja, H., Padró, L., Turmo, J.: Negation Cues Detection Using CRF on Spanish
    Product Review Text at NEGES 2018. In: Proceedings of NEGES 2018: Workshop
    on Negation in Spanish, CEUR Workshop Proceedings. vol. 2174, pp. 49–54 (2018)
23. Martı́, M.A., Taulé, M., Nofre, M., Marsó, L., Martı́n-Valdivia, M.T., Jiménez-
    Zafra, S.M.: La negación en español: análisis y tipologı́a de patrones de negación.
    Procesamiento del Lenguaje Natural (57), 41–48 (2016)
24. Morante, R., Blanco, E.: *SEM 2012 shared task: Resolving the scope and focus
    of negation. In: Proceedings of the First Joint Conference on Lexical and Compu-
    tational Semantics. pp. 265–274 (2012)
25. Morante, R., Sporleder, C.: Proceedings of the workshop on negation and specula-
    tion in natural language processing. In: Proceedings of the Workshop on Negation
    and Speculation in Natural Language Processing. pp. 1–109 (2010)
26. Morante, R., Sporleder, C.: Proceedings of the workshop on extra-propositional
    aspects of meaning in computational linguistics. In: Proceedings of the Workshop
    on Extra-Propositional Aspects of Meaning in Computational Linguistics (2012)
27. Mowery, D.L., Velupillai, S., South, B.R., Christensen, L., Martinez, D., Kelly,
    L., Goeuriot, L., Elhadad, N., Pradhan, S., Savova, G., et al.: Task 2: Share/clef
    ehealth evaluation lab 2014. In: Proceedings of CLEF 2014 (2014)
28. Padró, L., Stanilovsky, E.: Freeling 3.0: Towards wider multilinguality. In: Pro-
    ceedings of LREC 2012. Istanbul, Turkey (May 2012)
29. Taboada, M., Anthony, C., Voll, K.D.: Methods for creating semantic orientation
    dictionaries. In: Proceedings of LREC 2006. pp. 427–432 (2006)
30. Uzuner, Ö., South, B.R., Shen, S., DuVall, S.L.: 2010 i2b2/va challenge on con-
    cepts, assertions, and relations in clinical text. Journal of the American Medical
    Informatics Association 18(5), 552–556 (2011)



