<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anne-Lyse Minard</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Information Engineering, University of Brescia</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Fondazione Bruno Kessler</institution>
          ,
          <addr-line>Trento</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>VU Amsterdam</institution>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <abstract>
        <p>This report describes the FactA (Event Factuality Annotation) task presented at the EVALITA 2016 evaluation campaign. The task aimed at evaluating systems for the identification of the factuality profiling of events. Motivations, datasets, evaluation metrics, and post-evaluation results are presented and discussed.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Reasoning about events plays a fundamental role in
text understanding. It involves many aspects such
as the identification and classification of events, the
identification of event participants, the anchoring
and ordering of events in time, and their factuality
profiling.</p>
      <p>In the context of the 2016 EVALITA evaluation campaign, we organized FactA (Event Factuality Annotation), the first evaluation exercise for the factuality profiling of events in Italian. The task is a follow-up of Minard et al. (2015), presented in the “Towards EVALITA 2016” track at CLiC-it 2015. Factuality profiling is an important component for the interpretation of events in discourse. Different inferences can be made from events which have not happened (or whose happening is merely probable) than from those which are described as factual. Many NLP applications, such as Question Answering, Summarization, and Textual Entailment, among others, can benefit from the availability of this type of information.</p>
      <p>
        Factuality emerges through the interaction of linguistic markers and constructions, and its annotation represents a challenging task. The notion of factuality is closely related to other research areas thoroughly explored in NLP, such as subjectivity, belief, hedging, and modality
        <xref ref-type="bibr" rid="ref12 ref18 ref6">(Wiebe et al., 2004; Prabhakaran et al., 2010; Medlock and Briscoe, 2007; Saurí et al., 2006)</xref>
        . In this work, we adopted a notion of factuality which corresponds to the committed belief expressed by relevant sources towards the status of an event
        <xref ref-type="bibr" rid="ref14 ref3">(Saurí and Pustejovsky, 2012)</xref>
        . In particular, the factuality profile of events is expressed by the intersection of two axes: i.) certainty, which expresses a continuum ranging from absolutely certain to uncertain; and ii.) polarity, which defines a binary distinction: affirmed (or positive) vs. negated (or negative).
      </p>
      <p>In recent years, factuality profiling has been the focus of several evaluation exercises and shared tasks, especially for English, both in the newswire and in the biomedical domain. To mention the most relevant:</p>
      <list list-type="bullet">
        <list-item><p>the BioNLP 2009 Task 3 1 and the BioNLP 2011 Shared Task 2 aimed at recognizing whether biomolecular events were affected by speculation or negation;</p></list-item>
        <list-item><p>the CoNLL 2010 Shared Task 3 focused on hedge detection, i.e. the identification of speculated events, in biomedical texts;</p></list-item>
        <list-item><p>the ACE Event Detection and Recognition tasks 4 required systems to distinguish between asserted and non-asserted (e.g. hypothetical, desired, and promised) extracted events in news articles;</p></list-item>
        <list-item><p>the 2012 *SEM Shared Task on Resolving the Scope of Negation 5 devoted one of its subtasks to the identification of negated, i.e. counterfactual, events;</p></list-item>
        <list-item><p>the Event Nugget Detection task at the TAC KBP 2015 Event Track 6 aimed at assessing the performance of systems in identifying events and their factual, or realis, value in news <xref ref-type="bibr" rid="ref10">(Mitamura et al., 2015)</xref>;</p></list-item>
        <list-item><p>the 2015 7 and 2016 8 SemEval Clinical TempEval tasks required systems to assign a factuality value (i.e. attributed modality and polarity) to the extracted events in clinical notes.</p></list-item>
      </list>
      <p>1http://www.nactem.ac.uk/tsujii/GENIA/SharedTask/</p>
      <p>2http://2011.bionlp-st.org</p>
      <p>3http://rgai.inf.u-szeged.hu/index.php?lang=en&amp;page=conll2010st</p>
      <p>Finally, recent work such as the Richer Event Description annotation initiative9 has extended factuality annotation to temporal relations between pairs of events, or between events and temporal expressions, as a specific task independent of the factuality of the events involved, in order to represent claims about the certainty of the temporal relations themselves.</p>
      <p>FactA provides the research community with new benchmark datasets and an evaluation environment to assess system performance in the assignment of factuality values to events. The evaluation is structured in two tasks: a Main Task, which focuses on the factuality profile of events in the newswire domain, and a Pilot Task, which addresses the factuality profiling of events expressed in tweets. To better evaluate system performance on factuality profiling and avoid the impact of errors from related subtasks, such as event identification, we restricted the task to the assignment of factuality values.</p>
      <p>4http://itl.nist.gov/iad/mig/tests/ace/</p>
      <p>5http://ixa2.si.ehu.es/starsem/index.php?option=com_content&amp;view=article&amp;id=52&amp;Itemid=60.html</p>
      <p>6http://www.nist.gov/tac/2015/KBP/Event/index.html</p>
      <p>7http://alt.qcri.org/semeval2015/task6/</p>
      <p>8http://alt.qcri.org/semeval2016/task12/</p>
      <p>9https://github.com/timjogorman/RicherEventDescription/blob/master/guidelines.md</p>
      <p>Although as many as 13 teams registered for the task, none of them actually submitted any output. Nevertheless, we were able to run an evaluation, following the evaluation campaign conditions, for one system developed by one of the organizers, FactPro.</p>
      <p>The remainder of the paper is organized as follows: the evaluation exercise is described in detail in Section 2, while the datasets are presented in Section 3. In Section 4 we describe the evaluation methodology, and in Section 5 the results obtained with the FactPro system are illustrated. We conclude the paper in Section 6 with a discussion of the task and future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Task Description</title>
      <p>Following Tonelli et al. (2014) and Minard et al. (2014), in FactA we represent factuality by means of three attributes associated with events,10 namely certainty, time, and polarity. The FactA task consisted of taking as input a text in which the textual extent of events is given (i.e. gold standard data) and assigning to the events the correct values for the three factuality attributes11 according to the relevant source. In FactA, the relevant source is either the utterer (in direct, indirect, or reported speech) or the author of the news article (in all other cases). Systems do not have to provide the overall factuality value (FV): this is computed automatically on the basis of the certainty, time, and polarity attributes (see Section 2.2 for details).</p>
      <sec id="sec-2-1">
        <title>Factuality Attributes</title>
        <p>Certainty. Certainty relates to how sure the
relevant source is about the mentioned event and
admits the following three values: certain (e.g.
‘rassegnato’ in [1]), non certain (e.g. ‘usciti’ in
[2]), and underspecified (e.g. ‘spiegazioni’
in [3]).</p>
        <p>1. Smith ha rassegnato ieri le dimissioni; nomineranno il suo successore entro un mese. (“Smith resigned yesterday; they will appoint his replacement within a month.”)</p>
        <p>
          10Based on the TimeML specifications
          <xref ref-type="bibr" rid="ref13">(Pustejovsky et al., 2003)</xref>
          , the term event is used as a cover term for situations that happen or occur, including predicates describing states or circumstances in which something obtains or holds true.
        </p>
        <p>11Detailed instructions are reported in the FactA Annotation Guidelines, available at http://facta-evalita2016.fbk.eu/documentation</p>
        <p>2. Probabilmente i ragazzi sono usciti di casa tra le 20 e le 21. (“The guys probably went out between 8 and 9 p.m.”)</p>
        <p>3. L’Unione Europea ha chiesto “spiegazioni” sulla strage di Beslan. (“The European Union has asked for an explanation about the massacre of Beslan.”)</p>
        <p>Time. Time specifies the time at which an event is reported to have taken place or to be going to take place. Its values are past/present (for non-future events, e.g. ‘capito’ in [4]), future (for events that will take place, e.g. ‘lottare’ in [4] or ‘nomineranno’ in [1]), and underspecified (e.g. ‘verifica’ in [5]).</p>
        <p>4. I russi hanno capito che devono lottare insieme. (“Russians have understood that they must fight together.”)</p>
        <p>5. Su 542 aziende si hanno i dati definitivi mentre per le altre 38 si è tuttora in fase di verifica. (“They have the final data for 542 companies, while for the other 38 it is still under validation.”)</p>
        <p>Polarity. Polarity captures whether an event is affirmed or negated; consequently, it can be either positive (e.g. ‘rassegnato’ in [1]) or negative (e.g. ‘nominato’ in [6]). When there is not enough information available to detect the polarity of an event mention, its value is underspecified (e.g. ‘scompone’ in [7]).</p>
        <p>6. Non ha nominato un amministratore delegato. (“He did not appoint a CEO.”)</p>
        <p>7. Se si scompone il dato sul nero, si vede che il 23% è dovuto a lavoratori residenti in provincia. (“If we analyze the data about black market labor, we can see that 23% is due to workers resident in the province.”)</p>
        <p>Event mentions in texts can be used to refer to
events that do not correlate with a real situation in
the world (e.g. ‘parlare’ in [8]). For these event
mentions, participant systems are required to leave
the value of all three attributes empty.</p>
        <p>8. Guardate, penso che sia prematuro parlare del nuovo preside. (“Well, I think it is too early to talk about the new dean.”)</p>
        <p>The combination of the certainty, time, and polarity attributes described above determines the factuality value (FV) of an event with respect to the relevant source.</p>
        <p>As shown in Table 1, the FV can assume five values: i.) factual; ii.) counterfactual; iii.) non-factual; iv.) underspecified; and v.) no factuality (no fact). Table 1 also reports the full set of valid combinations of attribute values and the corresponding FV.</p>
        <p>A factual value is assigned if an event has
the following configuration of attributes:
certainty: certain
time: past/present
polarity: positive</p>
        <p>For instance, the event ‘rassegnato’ [resigned]
in [1] will qualify as a factual event. On
the other hand, a change in the polarity
attribute, i.e. negative, will give rise to a
counterfactual FV, like for instance the event
‘nominato’ [appointed] in [6].</p>
        <p>Whether an event is non-factual depends on the values of the certainty and time attributes. In particular, a non-factual value is assigned if either of the two cases below occurs, namely:
certainty: non certain; or
time: future</p>
        <p>This is the case for the event ‘lottare’ [fight]
in [4], where time is future, or the event
‘usciti’ [went out] in [2] where certainty is
non certain.</p>
        <p>The event FV is underspecified if
at least one between certainty and time is
underspecified, independently of the
polarity value, like for instance in the case of ‘verifica’
[validation] in [5].</p>
        <p>Finally, if the three attributes have no value, the FV is no factuality (e.g. ‘parlare’ [talk] in [8]).</p>
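        <p>The mapping from attribute values to the FV described above is deterministic and can be sketched as a small function. The following is an illustrative Python sketch, not the official FactA scorer; the value spellings are assumptions.</p>

```python
# Sketch of the FV computation described in this section (illustrative only;
# the official FactA scorer may use different value spellings).

def factuality_value(certainty, time, polarity):
    """Combine the three factuality attributes into the overall FV."""
    if certainty is None and time is None and polarity is None:
        # The event mention does not refer to a real-world situation.
        return "no_fact"
    if certainty == "non_certain" or time == "future":
        return "non_factual"
    if certainty == "underspecified" or time == "underspecified":
        # Underspecified certainty or time, independently of polarity.
        return "underspecified"
    # Here certainty == "certain" and time == "past/present".
    if polarity == "positive":
        return "factual"
    if polarity == "negative":
        return "counterfactual"
    # Fallback for an underspecified polarity (not listed in Table 1).
    return "underspecified"

# 'rassegnato' in [1]: certain, past/present, positive -> factual
assert factuality_value("certain", "past/present", "positive") == "factual"
# 'nominato' in [6]: the negated counterpart -> counterfactual
assert factuality_value("certain", "past/present", "negative") == "counterfactual"
```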
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Dataset Description</title>
      <p>
        We made available an updated version of Fact-Ita Bank
        <xref ref-type="bibr" rid="ref7">(Minard et al., 2014)</xref>
        as training data to participants. It consists of 169 documents selected from the Ita-TimeBank
        <xref ref-type="bibr" rid="ref4">(Caselli et al., 2011)</xref>
        and first released for the EVENTI task at EVALITA 2014.13 Fact-Ita Bank contains annotations for 6,958 events (see Table 2 for more details) and is distributed under a CC-BY-NC license.14
      </p>
      <p>12The number of tokens for the pilot test is computed after tokenization, i.e. hashtags and aliases can be split into more than one token and emoji are composed of several tokens.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption><p>Valid combinations of the attribute values and the corresponding FV.</p></caption>
        <table>
          <thead>
            <tr><th>Certainty</th><th>Time</th><th>Polarity</th><th>FV</th></tr>
          </thead>
          <tbody>
            <tr><td>certain</td><td>past/pres.</td><td>positive</td><td>factual</td></tr>
            <tr><td>certain</td><td>past/pres.</td><td>negative</td><td>counterfact.</td></tr>
            <tr><td>non cert.</td><td>any value</td><td>any value</td><td>non-fact.</td></tr>
            <tr><td>any value</td><td>future</td><td>any value</td><td>non-fact.</td></tr>
            <tr><td>certain</td><td>undersp.</td><td>any value</td><td>underspec.</td></tr>
            <tr><td>undersp.</td><td>past/pres.</td><td>any value</td><td>underspec.</td></tr>
            <tr><td>undersp.</td><td>undersp.</td><td>any value</td><td>underspec.</td></tr>
            <tr><td>(no value)</td><td>(no value)</td><td>(no value)</td><td>no fact.</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>
        As test data for the Main Task we selected the
Italian section of the NewsReader MEANTIME
corpus
        <xref ref-type="bibr" rid="ref9">(Minard et al., 2016)</xref>
        , a corpus of 120
Wikinews articles annotated at multiple levels. The
Italian section is called WItaC, the NewsReader
Wikinews Italian Corpus
        <xref ref-type="bibr" rid="ref16 ref8">(Speranza and Minard,
2015)</xref>
        , and consists of 15,676 tokens (see Table 2).
      </p>
      <p>
        As test data for the Pilot Task we annotated 301 tweets with event factuality, a subset of the test set of the EVALITA 2016 SENTIPOLC task
        <xref ref-type="bibr" rid="ref2">(Barbieri et al., 2016)</xref>
        (see Table 2).
      </p>
      <p>
        Training and test data, both for the Main and the Pilot Tasks, are in the CAT (Content Annotation Tool)
        <xref ref-type="bibr" rid="ref3">(Bartalesi Lenzi et al., 2012)</xref>
        labelled format. This is an XML-based stand-off format in which different annotation layers are stored in separate document sections and are related to each other and to the source data through pointers.
      </p>
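      <p>The stand-off mechanism can be illustrated with a toy example. The element and attribute names below are invented for illustration and differ from the real CAT schema: the point is only that annotation layers reference tokens through id pointers instead of wrapping the text.</p>

```python
# Toy illustration of a CAT-style stand-off layout (hypothetical element and
# attribute names): tokens carry ids, and an annotation layer points at them.
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<Document>
  <token t_id="1">Non</token>
  <token t_id="2">ha</token>
  <token t_id="3">nominato</token>
  <Markables>
    <EVENT m_id="e1" polarity="NEG"><token_anchor t_id="3"/></EVENT>
  </Markables>
</Document>
""")

# Resolve the event's surface form by following its token pointer.
tokens = {t.get("t_id"): t.text for t in doc.findall("token")}
event = doc.find("Markables/EVENT")
anchor = event.find("token_anchor").get("t_id")
print(tokens[anchor], event.get("polarity"))  # nominato NEG
```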
    </sec>
    <sec id="sec-4">
      <title>Evaluation</title>
      <p>Participation in the task consisted of providing only
the values for the three factuality attributes
(certainty, time, polarity), while the FV score was to
be computed through the FactA scorer on top of
these values.</p>
      <p>The evaluation is based on the micro-average F1 score of the FVs, which in this task is equivalent to accuracy, as every event should receive an FV (i.e. the total numbers of False Positives and False Negatives over the classes are equal). In addition, an evaluation of system performance on the single attributes (again using the micro-average F1 score, equivalent to accuracy) is provided as well. We consider this type of evaluation more informative than the one based on the FV alone, because it provides evidence of a system's ability to identify the motivations for the assignment of a given factuality value. To clarify this point, consider an event with FV non-factual (certainty non certain, time past/present, and polarity positive). A system might correctly identify that the FV of the event is non-factual because certainty is non certain, or it might erroneously identify that time is future.</p>
      <p>13https://sites.google.com/site/eventievalita2014/home</p>
      <p>14http://hlt-nlp.fbk.eu/technologies/fact-ita-bank</p>
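      <p>The equivalence between the micro-average F1 score and accuracy in this setting can be checked with a small sketch (labels and values are illustrative): with exactly one predicted label per event, every error counts once as a false positive for the predicted class and once as a false negative for the gold class, so micro-precision, micro-recall, and micro-F1 all reduce to accuracy.</p>

```python
# Sketch: micro-averaged F1 over single-label multi-class predictions
# collapses to accuracy (illustrative labels, not FactA data).

def micro_f1(gold, pred):
    tp = sum(g == p for g, p in zip(gold, pred))
    fp = len(pred) - tp   # each wrong prediction is a FP for its class ...
    fn = len(gold) - tp   # ... and a FN for the gold class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

gold = ["factual", "non_factual", "factual", "underspecified"]
pred = ["factual", "factual", "factual", "underspecified"]
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
assert micro_f1(gold, pred) == accuracy  # 0.75 in this toy example
```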
    </sec>
    <sec id="sec-5">
      <title>System Results</title>
      <p>Unfortunately, no participants took part in the FactA task. However, we managed to run an evaluation test with FactPro, a system for event factuality annotation in Italian developed by one of the organizers, respecting the evaluation campaign conditions. The system was evaluated against both gold standards, i.e. those of the Main and Pilot Tasks. In this section we describe the system and the results obtained on the FactA task.</p>
      <sec id="sec-5-1">
        <title>FactPro module</title>
        <p>
          FactPro is a module of the TextPro NLP pipeline 15
          <xref ref-type="bibr" rid="ref11">(Pianta et al., 2008)</xref>
          . It has been developed by
Anne-Lyse Minard in collaboration with Federico
Nanni as part of an internship.
        </p>
        <p>
          Event factuality annotation is performed by FactPro in three steps: (1) detection of the polarity of an event, (2) identification of the certainty of an event, and (3) identification of the semantic time. The three steps are based on a machine learning approach using a Support Vector Machine algorithm and are cast as text chunking tasks in which events have to be classified into different classes. For each step, a multi-class classification model is built using the text chunker Yamcha
          <xref ref-type="bibr" rid="ref5">(Kudo and Matsumoto, 2003)</xref>
          .
        </p>
        <p>FactPro requires the following pre-processing steps: sentence splitting, tokenization, morphological analysis, lemmatization, PoS tagging, chunking, and event detection and classification. As the data provided for FactA consist of texts already split into sentences, tokenized, and annotated with events, the sentence splitting, tokenization, and event detection and classification steps are not performed in these experiments.</p>
        <p>15http://textpro.fbk.eu</p>
        <p>Each classifier makes use of different features:
lexical, syntactic and semantic. They are described
in the remainder of the section. For the detection
of polarity and certainty, FactPro makes use of
trigger lists which have been built manually using
the training corpus.</p>
        <p>Polarity features:
– For all tokens: token’s lemma, PoS tags,
whether it is a polarity trigger (list manually
built);
– If the token is part of an event: presence of
polarity triggers before it, their number, the
distance to the closest trigger, and whether
the event is part of a conditional
construction;
– The polarity value tagged by the classifier
for the two preceding tokens.</p>
        <p>Certainty features:
– For all tokens: token’s lemma, flat
constituent (noun phrase or verbal phrase),
whether it is a modal verb, whether it is
a certainty trigger (list manually built);
– If the token is part of an event: the event
class (It-TimeML classes), presence of a
modal before and its value, and whether the
event is part of a conditional construction;
– The certainty value tagged by the classifier
for the two preceding tokens.</p>
        <p>Time features:
– For all tokens: token’s lemma and whether
it is a preposition;
– If the token is part of an event: tense and
mood of the verb before, presence of a
preposition before, event’s polarity and
certainty;
– If the token is a verb: its tense and mood;
– The time value tagged by the classifier for
the three preceding tokens.</p>
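        <p>A chunking-style classifier of this kind consumes one token per row, with one column per feature and the label to learn in the last column. The sketch below is illustrative only: the column layout, feature names, and trigger list are toy stand-ins for the richer feature sets described above, not FactPro's actual representation.</p>

```python
# Illustrative Yamcha-style feature table for a polarity classifier:
# one token per row, tab-separated feature columns, label last.
# Feature names and the trigger list are hypothetical.

NEGATION_TRIGGERS = {"non", "mai", "senza"}  # toy trigger list built from training data

def token_row(token, lemma, pos, is_event, label):
    is_trigger = "Y" if lemma in NEGATION_TRIGGERS else "N"
    event_col = "EVENT" if is_event else "O"
    return "\t".join([token, lemma, pos, is_trigger, event_col, label])

sentence = [
    ("Non", "non", "ADV", False, "O"),
    ("ha", "avere", "AUX", False, "O"),
    ("nominato", "nominare", "VERB", True, "NEGATIVE"),
]
for tok in sentence:
    print(token_row(*tok))
```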
        <p>Each token is represented using these features, as well as some of the features of the preceding and following tokens. We defined the set of features used by each classifier by performing several evaluations on a subsection of the Fact-Ita Bank corpus.</p>
        <p>[Table 3: results of the baseline and of FactPro on the Main and Pilot Tasks.]</p>
        <p>We can observe from Table 3 that FactPro performs better on the detection of polarity and certainty than on the identification of time. One reason is the predominance of a single value for the polarity and certainty attributes, as opposed to two values for time. For example, in the training corpus, 94% of the events have polarity positive and 86% are certain, whereas 71% of the events are past/present and 22% are future.</p>
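        <p>These skewed distributions imply strong majority-class baselines for each attribute. The sketch below simply reads off the expected accuracy of always predicting the most frequent training value; it is illustrative, and the baseline system reported in Table 3 may be defined differently.</p>

```python
# Majority-class baselines implied by the training distribution above
# (illustrative; percentages are those reported for the training corpus).
train_distribution = {
    "polarity": {"positive": 0.94},
    "certainty": {"certain": 0.86},
    "time": {"past/present": 0.71, "future": 0.22},
}

for attr, dist in train_distribution.items():
    label, freq = max(dist.items(), key=lambda kv: kv[1])
    # Always predicting the most frequent value yields roughly this accuracy
    # on data with the same distribution.
    print(f"{attr}: predict '{label}' -> ~{freq:.0%} accuracy")
```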
        <p>An extensive error analysis of the system output for the three attributes was conducted. As for the polarity attribute, the analysis showed that the system's errors on negated events are not mainly due to the sparseness of negated events in the training data: they mainly concern the negation scope, whereas when the system missed a negative event this was mainly due to the incompleteness of the trigger lists (e.g. mancata in dopo la mancata approvazione is a good trigger for polarity negative, but it is absent from the trigger list).</p>
        <p>The detection of non certain events works well when the event is preceded by a verb in the conditional mood or when it is part of an infinitive clause introduced by per. However, when the uncertainty of an event is expressed by the semantics of preceding words (e.g. salvataggio in il piano di salvataggio), the system makes errors.</p>
        <p>With respect to the annotation of future
events, the observations are similar to those for
non certain events. Indeed, future events are
well recognized by the system when they are part
of an infinitive clause introduced by the preposition
per as well as when their tense is future.</p>
        <p>Finally, we observed that FactPro makes many errors on the factuality of nominal events. In the Main Task it correctly identified the FV of 81% of the verbal events but only 61% of the nominal events.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion and Future Work</title>
      <p>The lack of participants in the task limits the discussion of the results to the in-house developed system. According to the outcome of a questionnaire organized by the 2016 EVALITA chairs, the main reason for the lack of participation in FactA was that potential participants gave priority to other EVALITA tasks. Nevertheless, FactA achieves two main results: i.) setting state-of-the-art results for the factuality profiling of events in two text types in Italian, namely news articles and tweets; and ii.) making available to the community a new benchmark corpus and a standardized evaluation environment for comparing systems' performance and facilitating the replicability of results.</p>
      <p>
        The test data used for the Main Task consists
of the Italian section of the MEANTIME corpus
        <xref ref-type="bibr" rid="ref9">(Minard et al., 2016)</xref>
        . MEANTIME contains the
same documents aligned in English, Italian,
Spanish and Dutch, thus making available a multilingual
environment for cross-language evaluation of the
factuality profiling of events. Furthermore, within
the NewsReader project, a module for event
factuality annotation has been implemented and evaluated
against the English section of the MEANTIME corpus
        <xref ref-type="bibr" rid="ref1">(Agerri et al., 2015)</xref>
        . The evaluation was performed differently than in FactA: in particular, no gold events were provided as input to the system, so the evaluation of factuality was done only for the events correctly identified by the event detection module. The system obtained an accuracy of 0.88, 0.86, and 0.59 for polarity, certainty, and time, respectively.
      </p>
      <p>The Pilot Task was aimed at evaluating how well systems built for standard language perform on social media texts, and at making available a set of tweets annotated with event mentions (following the TimeML definition of events) and their factuality value. The pilot data are shared with three other EVALITA 2016 tasks (PoSTWITA, NEEL-IT and SENTIPOLC), which contributed to the creation of a richly annotated corpus of tweets to be used for future cross-fertilization tasks. Finally, the annotation of tweets raised new issues for factuality annotation, because tweets contain many imperatives and interrogatives that are generally absent from news and whose factuality status is not obvious (e.g. Ordini solo quello che ti serve).</p>
      <p>The results obtained by FactPro, as reported in Table 3 and Table 4, show that: i.) the system is able to predict the FV of events in the news domain with fairly high accuracy, and the factuality of events in tweets with a lower but still good score; ii.) the difference in performance between the news and tweet text types suggests that specific training data may be required to address the peculiarities of the language of tweets; iii.) the F1 scores for the certainty, polarity, and time attributes clearly indicate areas of improvement and also contribute to a better understanding of the system's results; iv.) the F1 scores on the FV suggest that extending the training data with tweets could also benefit the identification of values which are not frequent in the news domain, such as no fact.</p>
      <p>Future work will aim at re-running the task from raw text and at developing specific modules for the factuality of events according to the text types in which they occur. Finally, we plan to run a cross-fertilization task concerning the temporal ordering and anchoring of events and factuality profiling.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work has been partially supported by the
EU NewsReader Project (FP7-ICT-2011-8 grant
316404) and the NWO Spinoza Prize project
Understanding Language by Machines (sub-track 3).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Rodrigo</given-names>
            <surname>Agerri</surname>
          </string-name>
          , Itziar Aldabe, Zuhaitz Beloki, Egoitz Laparra, German Rigau, Aitor Soroa, Marieke van Erp,
          <string-name>
            <surname>Antske Fokkens</surname>
          </string-name>
          , Filip Ilievski, Ruben Izquierdo, Roser Morante, Chantal van Son,
          <string-name>
            <surname>Piek Vossen</surname>
          </string-name>
          , and
          <string-name>
            <surname>Anne-Lyse Minard</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Event Detection, version 3</article-title>
          .
          <source>Technical Report D4-2-3</source>
          , VU Amsterdam. http://kyoto.let.vu.nl/newsreader_deliverables/NWR-D4-2-3.pdf
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Francesco</given-names>
            <surname>Barbieri</surname>
          </string-name>
          , Valerio Basile, Danilo Croce, Malvina Nissim, Nicole Novielli, and
          <string-name>
            <given-names>Viviana</given-names>
            <surname>Patti</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Overview of the EVALITA 2016 SENTiment POLarity Classification Task</article-title>
          . In Pierpaolo Basile, Anna Corazza, Franco Cutugno, Simonetta Montemagni, Malvina Nissim, Viviana Patti, Giovanni Semeraro, and Rachele Sprugnoli, editors,
          <source>Proceedings of Third Italian Conference on Computational Linguistics</source>
          (CLiC-it
          <year>2016</year>
          ) &amp;
          <article-title>Fifth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</article-title>
          .
          <source>Final Workshop (EVALITA</source>
          <year>2016</year>
          ).
          <article-title>Associazione Italiana di Linguistica Computazionale (AILC).</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Valentina</given-names>
            <surname>Bartalesi Lenzi</surname>
          </string-name>
          , Giovanni Moretti, and
          <string-name>
            <given-names>Rachele</given-names>
            <surname>Sprugnoli</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>CAT: the CELCT Annotation Tool</article-title>
          .
          <source>In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC'12)</source>
          , pages
          <fpage>333</fpage>
          -
          <lpage>338</lpage>
          , Istanbul, Turkey, May.
          <source>European Language Resources Association (ELRA).</source>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Tommaso</given-names>
            <surname>Caselli</surname>
          </string-name>
          , Valentina Bartalesi Lenzi, Rachele Sprugnoli, Emanuele Pianta, and
          <string-name>
            <given-names>Irina</given-names>
            <surname>Prodanof</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Annotating Events, Temporal Expressions and Relations in Italian: the It-TimeML Experience for the Ita-TimeBank</article-title>
          .
          <source>In Linguistic Annotation Workshop</source>
          , pages
          <fpage>143</fpage>
          -
          <lpage>151</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Taku</given-names>
            <surname>Kudo</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yuji</given-names>
            <surname>Matsumoto</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Fast Methods for Kernel-based Text Analysis</article-title>
          .
          <source>In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1, ACL '03</source>
          , pages
          <fpage>24</fpage>
          -
          <lpage>31</lpage>
          , Stroudsburg, PA, USA.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Ben</given-names>
            <surname>Medlock</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ted</given-names>
            <surname>Briscoe</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Weakly supervised learning for hedge classification in scientific literature</article-title>
          .
          <source>In ACL</source>
          , volume
          <year>2007</year>
          , pages
          <fpage>992</fpage>
          -
          <lpage>999</lpage>
          . Citeseer.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Anne-Lyse</given-names>
            <surname>Minard</surname>
          </string-name>
          , Alessandro Marchetti, and
          <string-name>
            <given-names>Manuela</given-names>
            <surname>Speranza</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Event Factuality in Italian: Annotation of News Stories from the ItaTimeBank</article-title>
          .
          <source>In Proceedings of CLiC-it</source>
          <year>2014</year>
          , First Italian Conference on Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Anne-Lyse</given-names>
            <surname>Minard</surname>
          </string-name>
          , Manuela Speranza, Rachele Sprugnoli, and
          <string-name>
            <given-names>Tommaso</given-names>
            <surname>Caselli</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>FacTA: Evaluation of Event Factuality and Temporal Anchoring</article-title>
          .
          <source>In Proceedings of the Second Italian Conference on Computational Linguistics</source>
          CLiC-it
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Anne-Lyse</given-names>
            <surname>Minard</surname>
          </string-name>
          , Manuela Speranza, Ruben Urizar, Begoña Altuna, Marieke van Erp,
          <string-name>
            <given-names>Anneleen</given-names>
            <surname>Schoen</surname>
          </string-name>
          , and Chantal van Son.
          <year>2016</year>
          .
          <article-title>MEANTIME, the NewsReader Multilingual Event and Time Corpus</article-title>
          .
          <source>In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC</source>
          <year>2016</year>
          ), Paris, France, May.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Teruko</given-names>
            <surname>Mitamura</surname>
          </string-name>
          , Yukari Yamakawa, Susan Holm, Zhiyi Song, Ann Bies, Seth Kulick, and
          <string-name>
            <given-names>Stephanie</given-names>
            <surname>Strassel</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Event Nugget Annotation: Processes and Issues</article-title>
          .
          <source>In Proceedings of the 3rd Workshop on EVENTS at the NAACL-HLT</source>
          , pages
          <fpage>66</fpage>
          -
          <lpage>76</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Emanuele</given-names>
            <surname>Pianta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Christian</given-names>
            <surname>Girardi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Roberto</given-names>
            <surname>Zanoli</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>The TextPro Tool Suite</article-title>
          .
          <source>In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC</source>
          <year>2008</year>
          ), Marrakech, Morocco.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Vinodkumar</given-names>
            <surname>Prabhakaran</surname>
          </string-name>
          , Owen Rambow, and
          <string-name>
            <given-names>Mona</given-names>
            <surname>Diab</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Automatic committed belief tagging</article-title>
          .
          <source>In Proceedings of the 23rd International Conference on Computational Linguistics: Posters</source>
          , pages
          <fpage>1014</fpage>
          -
          <lpage>1022</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>James</given-names>
            <surname>Pustejovsky</surname>
          </string-name>
          , José M. Castaño, Robert Ingria, Roser Saurí, Robert J.
          <string-name>
            <surname>Gaizauskas</surname>
          </string-name>
          , Andrea Setzer, Graham Katz, and
          <string-name>
            <given-names>Dragomir R.</given-names>
            <surname>Radev</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>TimeML: Robust Specification of Event and Temporal Expressions in Text</article-title>
          .
          <source>In New Directions in Question Answering</source>
          , pages
          <fpage>28</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Roser</given-names>
            <surname>Saurí</surname>
          </string-name>
          and
          <string-name>
            <given-names>James</given-names>
            <surname>Pustejovsky</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Are you sure that this happened? Assessing the factuality degree of events in text</article-title>
          .
          <source>Computational Linguistics</source>
          ,
          <volume>38</volume>
          (
          <issue>2</issue>
          ):
          <fpage>261</fpage>
          -
          <lpage>299</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Roser</given-names>
            <surname>Saurí</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Marc</given-names>
            <surname>Verhagen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>James</given-names>
            <surname>Pustejovsky</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Annotating and recognizing event modality in text</article-title>
          .
          <source>In Proceedings of 19th International FLAIRS Conference.</source>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Manuela</given-names>
            <surname>Speranza</surname>
          </string-name>
          and
          <string-name>
            <given-names>Anne-Lyse</given-names>
            <surname>Minard</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Cross-language projection of multilayer semantic annotation in the NewsReader Wikinews Italian Corpus (WItaC)</article-title>
          .
          <source>In Proceedings of CLiC-it</source>
          <year>2015</year>
          , Second Italian Conference on Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Sara</given-names>
            <surname>Tonelli</surname>
          </string-name>
          , Rachele Sprugnoli, Manuela Speranza, and
          <string-name>
            <given-names>Anne-Lyse</given-names>
            <surname>Minard</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>NewsReader Guidelines for Annotation at Document Level</article-title>
          .
          <source>Technical Report NWR2014-2-2</source>
          , Fondazione Bruno Kessler. http://www.newsreader-project.eu/files/2014/12/NWR-2014-2-2.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Janyce</given-names>
            <surname>Wiebe</surname>
          </string-name>
          , Theresa Wilson, Rebecca Bruce, Matthew Bell, and
          <string-name>
            <given-names>Melanie</given-names>
            <surname>Martin</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Learning subjective language</article-title>
          .
          <source>Computational linguistics</source>
          ,
          <volume>30</volume>
          (
          <issue>3</issue>
          ):
          <fpage>277</fpage>
          -
          <lpage>308</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>