Enriching the Ita-TimeBank with Narrative Containers

Alice Bracchi
Università degli Studi di Pavia
C.so Strada Nuova 65, 27100 Pavia
alice.bracchi@gmail.com

Tommaso Caselli
Vrije Universiteit Amsterdam
De Boelelaan 1105, 1081 HV Amsterdam
t.caselli@vu.nl

Irina Prodanof
Università degli Studi di Pavia
C.so Strada Nuova 65, 27100 Pavia
irina.prodanof@gmail.com

Abstract

English. This paper reports on an annotation experiment to enrich an existing temporally annotated corpus of Italian news articles with Narrative Containers, annotation devices representing temporal windows in text and marking up very informative temporal relations between temporal entities. The annotation has shown that the distribution of Narrative Containers is sensitive to the text genre and may be used to facilitate the creation of informative timelines.

Italiano. Questo lavoro illustra i risultati di un esperimento di annotazione per l’identificazione di Contenitori Narrativi, ovvero marcatori di “finestre” temporali in un testo, come strategia per arricchire un corpus di articoli di quotidiano in lingua italiana, già annotato con informazioni temporali. L’annotazione ha mostrato che la distribuzione dei Contenitori Narrativi è legata al genere testuale e può essere usata per facilitare la creazione di linee temporali di eventi più informative.

1 Introduction

Research in Temporal Processing has seen an increasing interest thanks to the availability of annotation schemes and corpora in multiple languages (Pustejovsky et al., 2003; Bittar et al., 2011; Caselli et al., 2011; Saurí and Badia, 2012), and the organization of evaluation campaigns (TempEval (Verhagen et al., 2007; Verhagen et al., 2010; UzZaman et al., 2013), Clinical TempEval (Bethard et al., 2015; Bethard et al., 2016), Cross-Document TimeLine (Minard et al., 2015), Temporal QA (Llorens et al., 2015), and EVENTI (Caselli et al., 2014)). This has established best practices, common evaluation frameworks, international standards (e.g. ISO-TimeML (Pustejovsky et al., 2010)), and approaches to solve such a complex task. However, the expression of time in text/discourse is by no means obvious and the automatic extraction of timelines is not a solved task yet. One of the limits of current annotation frameworks and corpora lies mainly in the sparseness of the available temporal relations and in the fine-grained values used to classify the temporal links. For instance, in the TempEval-3 corpus the ratio of temporal relations to events plus temporal expressions is 0.8 (Bethard et al., 2014) for 13 temporal values. In the EVENTI corpus, the ratio is even smaller, only 0.19 for 13 temporal values.¹ Furthermore, in some cases annotation guidelines are not informative enough concerning what types of temporal links to annotate, or they force the annotation of temporal relations between pairs of events when they should not be annotated.

¹ The smaller ratio for the Italian data is also due to specific restrictions on the annotation of the temporal relations as reported in the EVENTI Annotation Guidelines and explained in Section 2.

Attempts to overcome these limits have focused on three main strategies: i.) annotating particular sets of temporal relations (Kolomiyets et al., 2012); ii.) elaborating detailed annotation guidelines for each kind of temporal relation (event-temporal expression pairs, event-event pairs, and temporal expression-temporal expression pairs); and iii.) developing densely connected temporal graphs, where all valid relations among the temporal entities (events and temporal expressions) are marked up, including inferred relations based on transitive properties of the temporal relations (e.g. if event A is BEFORE event B and event B IS_INCLUDED in event C, then event A is BEFORE event C) (Bethard et al., 2014). We consider these solutions as partial, as they are not able to address the issue of identifying and extracting informative timelines, i.e. a set of maximally informative temporal links where relevant events in a text/discourse are correctly anchored to time, and then chronologically ordered. This paper reports on the first annotation effort to enrich existing resources for Temporal Processing in Italian by adopting a document-level approach rather than a sentence-level one. Following the proposal of Narrative Containers (NCs) (Pustejovsky and Stubbs, 2011), as embedding intervals where events occur, we developed an annotation scheme for their identification on the EVENTI corpus (Caselli et al., 2014)², as a strategy to increase the informativeness of the existing annotations and, possibly, improve systems' temporal awareness.

² https://sites.google.com/site/eventievalita2014/

The remainder of this paper is structured as follows: the EVENTI corpus will be briefly introduced in Section 2, with a particular emphasis on the available temporal relations. Section 3 will present the notion of Narrative Container and the proposed annotation scheme. In Section 4 the results of a pilot annotation on the EVENTI dataset will be reported. Finally, conclusion, future work, and a pointer to the annotated data and guidelines will be reported in Section 5.

2 Temporal Relations in the EVENTI Corpus

The EVENTI corpus, released in the context of the EVALITA 2014³ workshop, consists of 3 datasets: the Main task training data, the Main task test data, and the Pilot task test data. The corpus has been annotated with a simplified version of the It-TimeML Annotation Guidelines (Caselli et al., 2011), an adaptation to Italian of the TimeML Guidelines. Four tags have been used to annotate the data: EVENT, TIMEX3, SIGNAL, and TLINK.

³ http://www.evalita.it/2014/tasks/eventi

The EVENT tag is used to annotate all lexical items which may realize an event mention. It includes verbs, nouns, adjectives, and prepositional phrases. The tag is enriched with 8 attributes expressing tense, (grammatical) aspect, part-of-speech, mood, modality, verb form, TimeML class, and polarity.

The TIMEX3 tag is used for the annotation of temporal expressions (timexes), expressing the type, the value and whether the timex is absolute or relative (e.g. “2015-05-18” vs. “yesterday” [ieri]).

The SIGNAL tag is employed to mark any linguistic elements, such as prepositions (e.g. in [in]), adverbs (e.g. before [prima]), or conjunctions (e.g. when [quando]), which support the identification and classification of a temporal relation between target entities (e.g. events and timexes).

Finally, the TLINK tag is used to annotate temporal relations. In the EVENTI task, the subset of possible temporal relations has been restricted to three subtypes of intra-sentence temporal relations, namely: i.) pairs of syntactic main events in the same sentence; ii.) pairs of syntactic main event and subordinate event in the same sentence; and iii.) pairs of event and timexes. All 13 temporal relation values from It-TimeML (BEFORE, AFTER, IS_INCLUDED, INCLUDES, SIMULTANEOUS, I(MMEDIATELY)_AFTER, I(MMEDIATELY)_BEFORE, IDENTITY, MEASURE, BEGINS, ENDS, BEGUN_BY and ENDED_BY) have been used.

The Main task datasets, which have been enriched with Narrative Containers, add up to 130,279 tokens, divided into 103,593 tokens for training and 26,686 for test. They contain 21,633 EVENTs (17,835 in training and 3,798 in test), 3,359 TIMEX3 (2,753 in training and 624 in test), 1,163 SIGNALs (923 in training and 231 in test), and 4,561 TLINKs (3,500 in training and 1,061 in test).

3 Adding Narrative Containers to News Articles

The notion of Narrative Container (NC) was first introduced by Pustejovsky and Stubbs (2011) to deal with some aspects of Temporal Processing, such as sensitivity to the text genre and interaction with discourse relations, which are addressed neither in the TimeML Guidelines nor in the TimeBank corpus. NCs were proposed as a temporal window, providing left and right boundaries, within which events not anchored to timexes could have happened, thus overcoming issues related to the linking of events with the Document Creation Time (DCT), i.e. when a text was written or published. In particular, standard TimeML markup requires that all events have a link with the DCT but fails to specify that each event should also be linked to its actual temporal anchor, i.e. to its moment of occurrence. As reported in Pustejovsky and Stubbs (2011), in example 1, TimeML guidelines will order both event mentions, e1 and e2, to the DCT with a BEFORE relation, and anchor e1 to the timex “yesterday” (t), but will fail to provide the anchoring of e2:

1. The bomb exploded[e1] yesterday[t=2011-09-09] and killed[e2] three people. [DCT=2011-09-10]

A further justification for the introduction of NCs is related to the different informational status of temporal relations. Assuming the informativeness of a temporal link to be a function of the information contained in the individual links and their closure, an anchoring relation, that is a relation between a timex and an event explicitly stating when the event occurred, such as the one between e1 and the timex “yesterday” in example 1 (i.e. a temporal value of INCLUDES or IS_INCLUDED), is assumed to be more informative than an ordering relation, i.e. a precedence relation between two events.

To the best of our knowledge, the only corpus which extensively adopts the notion of NC and has available annotated data is the THYME corpus of clinical narratives (Styler IV et al., 2014). Our task is the first attempt at tackling temporal containment annotation over news articles in Italian.

A NC enables an accurate reproduction of the way events in text cluster around temporal reference points, explicitly or implicitly realized in the document, as the narration unfolds. NC relations are thus anchoring relations between pairs of events or between events and temporal expressions. They are marked with an additional link tag, i.e. CONTAINS, to distinguish them from standard TLINKs. Each NC relation admits two components: i.) the narrative anchor, i.e. an element pointing to a specific temporal dimension shared by other events or timexes within the text; and ii.) the anchored element(s), i.e. events which satisfy the anchorability requirements (see Section 3.1 for details) and participate in an NC relation. Timex anchors are chosen on a transparency basis (i.e. granularity and nature of the timex), whereas Event anchors are chosen according to their relevance and salience for the timeline.

Two sub-types of NCs can be identified:

• Temporal Containers (TCs): they correspond to the timexes in the text which clearly anchor the events in analysis on a timeline; the relation can hold both at intra- and inter-sentence level. Example 2 from our annotated corpus shows a timex (2001) and the events it anchors (e1–e4):

2. [...] la Sonata composta[e1] nel 2001[TCanchor], il cui primo esecutore fu[e2] lo stesso Lucchesini. In questa esecuzione[e3] si ritrovavano[e4] già tutte le doti musicali di Lucchesini [...].
[Eng.: “[...] the Sonata composed in 2001, whose first performer was Lucchesini himself. In this performance all of Lucchesini's musical gifts could already be found [...].”]

• Event Containers (ECs): they correspond to event mentions which function as a temporal anchor for other event mentions. ECs can be useful in cases where no anchoring timex is available or to model event-subevent relations. Example 3 shows a sentence with no explicit temporal expression, where the anchoring of events (e1–e3) is possible only with respect to the event (ricognizione).

3. [...] Durante la ricognizione[ECanchor], il tenente ha dato disposizioni[e1] per il presidio, e nella fase[e2] iniziale ha ordinato[e3] ai sottoposti di fare rapporto al campo base.
[Eng.: “[...] During the reconnaissance, the lieutenant gave instructions for the garrison, and in the initial phase ordered his subordinates to report to the base camp.”]

Figure 1 serves as a visual representation of the NC as annotated in example 2, a sentence taken from the annotated corpus. By means of NCs, a document timeline will result in an ordered succession of NCs rather than of isolated events.

[Figure 1: Visual representation of a NC for the sentence in Example 2.]

Naturally, the NC represented here is only a visual aid picturing the conceptual outcome of applying CONTAINS relations between the anchor (here, the TIMEX3 2001) and anchored elements (here, EVENTs composta, fu, esecuzione, and ritrovavano).

3.1 Event Anchorability Requirements

The set of events which can be anchored has been restricted to factual events. The identification of eligible anchorable events has been manually conducted at this stage of the annotation. We adopted the definition of factuality proposed in FactBank (Saurí, 2008), which is based on the double axis of polarity (positive vs. negative) and certainty. For the sake of our annotation task, only positive and certain events can be anchored. Events in the future were generally not annotated as they normally do not have a certain status. However, those events with an established schedule (e.g. deadlines, meetings), or whose future temporal window is assumed to be certain, such as festivities, have been annotated in anchoring relations as well.

We excluded all events which are presented as subjective (i.e. judgements, opinions). In example 4, esplosione is a factual event and was anchored as such, whereas sbagliato describes it through the grid of the writer's judgement, who states that the explosion was a mistake, and was thus not anchored.

4. L’esplosione[e1] è avvenuta a mezzanotte di lunedì [...]. Insomma, gli attentatori hanno sbagliato[e2] obiettivo.
[Eng.: “The explosion took place at midnight on Monday [...]. In short, the attackers hit the wrong target.”]

Finally, generic events, i.e. events which acquire some kind of attributive value towards discourse participants, expressing persistent properties or reiterated, habitual activities, were not anchored.

4 The EVENTI-NC Corpus

The EVENTI-NC corpus includes documents from both the training and the test sections of the Main task of the EVENTI corpus. It includes 58 annotated articles, for a total of 24,259 tokens, covering roughly 11% of the EVENTI corpus; Table 1 shows the number of EVENTs and TIMEX3 involved in our annotation.

General EVENTI-NC statistics
Annotated tokens      24,259
Annotated articles    58
EVENT markables       3645
TIMEX3 markables      595

Table 1: Overview of corpus statistics.

Table 2 reports the number of annotated containers in our corpus, and their distribution according to their type. TCs make up almost 63% of the total number of NCs, against 37% for ECs. It is interesting to notice that 11.8% of all NCs (37 of the 197 TCs) are realized by empty TIMEX3s, i.e. temporal expressions which do not correspond to lexical items but can be inferred and which are necessary for assigning a correct value to a timex.

Annotated NCs
Type                       Number    %
ECs
  Verbal anchors           61        19.5
  Nominal anchors          55        17.6
  Total EC n.              116       37.1
TCs
  Text-consuming TIMEX3s   160       51.1
  Empty TIMEX3s            37        11.8
  Total TC n.              197       62.9
Total NC n.                313

Table 2: Distribution of Narrative Containers in the corpus.

4.1 Distribution of Narrative Containers anchors

We conducted an in-depth analysis of the NC anchors following two parameters: i.) the properties of NC anchors on their own; and ii.) the sensitivity to the document genre, i.e. the news domain, along the lines of Pustejovsky and Stubbs (2011).

Concerning the first parameter, we first investigated the incidence of verbal anchors as opposed to nominal anchors. Whereas there appears to be no tendency towards verbs or nouns being more likely to anchor other events, it is interesting to take a look within these categories. Out of all the verbal anchors, 42.9% are reporting verbs or verbs employed in a declarative context. We observed that there is a preference for ECs to correspond to the event with the highest degree of topicality in the article, or the most important event (climax event). For example, one article⁴ reports on President Clinton's surgery in 2004: the largest EC in the document is anchored by intervento (surgery), with a total of 12 anchored items.

⁴ adige20040709_id405401.txt

Sensitivity to text genre can be easily observed with TC anchors. 25% of them anchor events in a timespan that can be measured as ±1 day with respect to the DCT. Anchors for these containers are mostly represented by non-absolute temporal expressions, such as temporal adverbs (e.g. “ieri” [yesterday], “domani” [tomorrow], among others) and by the DCT itself, which represents 11% of the TC anchors.

Genre-sensitivity might also be the factor behind the average number of NCs in the corpus. The documents have an average of 5.17 NCs, and even for lengthier articles, the textual anchors were rarely more than 7. The average of 5.17 NCs/article might be due to the fact that newspaper articles usually refer to a limited number of facts, whose core is usually made of a handful of recent happenings, whereas the fluctuating relationship between length and NC number usually depends both on the content of the article and on the granularity of the selected NC.

5 Conclusion

This paper reports on a first proposal of an annotation scheme and accompanying annotated data for NCs in Italian news articles. The NC annotation is an additional layer on top of already available data for Temporal Processing in Italian. It addresses pending issues (e.g. the annotation of the temporal relations between events and the Document Creation Time) and increases the informativeness of the document timelines. Overall, we observed that there is a preference for NCs to be realized by timexes in a limited time span. However, NCs may also be realised by events. In this case, nouns and verbs have a similar distribution, with a preference for events which have a central role in the news or facilitate the clustering of the information (e.g. reporting events). Such a behaviour is different with respect to clinical narratives, where nominal events are more frequently selected as NCs (Bethard et al., 2016). This suggests that different text genres present different ways of organizing events on a timeline. The introduction of the factuality parameter to select the anchoring events is a strategy to clean timelines and to move Temporal Processing from a single document to a cross-document task.

Future work will aim at assessing the reliability of the proposed scheme via an inter-annotator agreement study and at completing the annotation of the entire EVENTI corpus. Finally, the annotated data and guidelines are publicly available⁵ to encourage additional testing and experiments.

⁵ https://sites.google.com/site/ittimeml/documents/narrative-container-data.zip?attredirects=0&d=1

References

Steven Bethard, Nathaniel Chambers, Bill McDowell, and Taylor Cassidy. 2014. An annotation framework for dense event ordering. In 52nd Annual Meeting of the Association for Computational Linguistics, 52, Baltimore, MD, USA, June. Association for Computational Linguistics.

Steven Bethard, Guergana Savova, James Pustejovsky, and Marc Verhagen. 2015. SemEval-2015 Task 6: Clinical TempEval. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pages 806–814. Association for Computational Linguistics.

Steven Bethard, Guergana Savova, Leon Derczynski, Wei-Te Chen, James Pustejovsky, and Marc Verhagen. 2016. SemEval-2016 Task 12: Clinical TempEval. In Proceedings of SemEval-2016, pages 977–987, Stroudsburg, PA, USA. Association for Computational Linguistics.

André Bittar, Pascal Amsili, Pascal Denis, and Laurence Danlos. 2011. French Timebank: an ISO-TimeML annotated reference corpus. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers-Volume 2, pages 130–134. Association for Computational Linguistics.

Tommaso Caselli, Valentina Bartalesi Lenzi, Rachele Sprugnoli, Emanuele Pianta, and Irina Prodanof. 2011. Annotating Events, Temporal Expressions and Relations in Italian: the It-TimeML Experience for the Ita-TimeBank. In Proceedings of the Fifth Law Workshop (LAW V), pages 143–151, Portland, Oregon. Association for Computational Linguistics.

Tommaso Caselli, Rachele Sprugnoli, Manuela Speranza, and Monica Monachini. 2014. EVENTI: EValuation of Events and Temporal INformation at Evalita 2014. In Proceedings of the First Italian Conference on Computational Linguistics CLiC-it 2014 & of the Fourth International Workshop EVALITA 2014. Pisa University Press.

Oleksandr Kolomiyets, Steven Bethard, and Marie-Francine Moens. 2012. Annotating narrative timelines as temporal dependency structures. In Proceedings of the International Conference on Linguistic Resources and Evaluation.

Hector Llorens, Nathanael Chambers, Naushad UzZaman, Nasrin Mostafazadeh, James Allen, and James Pustejovsky. 2015. SemEval-2015 Task 5: QA TempEval - Evaluating temporal information understanding with question answering. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pages 792–800.

Anne-Lyse Minard, Manuela Speranza, Eneko Agirre, Itziar Aldabe, Marieke van Erp, Bernardo Magnini, German Rigau, Ruben Urizar, and Fondazione Bruno Kessler. 2015. SemEval-2015 Task 4: TimeLine: Cross-document event ordering. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pages 778–786.

James Pustejovsky and Amber Stubbs. 2011. Increasing informativeness in temporal annotation. In Proceedings of the 5th Linguistic Annotation Workshop, pages 152–160. Association for Computational Linguistics.

James Pustejovsky, Patrick Hanks, Roser Saurí, Andrew See, Robert Gaizauskas, Andrea Setzer, Dragomir Radev, Beth Sundheim, David Day, Lisa Ferro, et al. 2003. The TimeBank corpus. In Corpus linguistics, volume 2003, page 40.

James Pustejovsky, Kiyong Lee, Harry Bunt, and Laurent Romary. 2010. ISO-TimeML: An international standard for semantic annotation. In Seventh International Conference on Language Resources and Evaluation (LREC’10), Valletta, Malta.

Roser Saurí and Toni Badia. 2012. Spanish TimeBank 1.0. LDC catalog ref. LDC2012T12.

Roser Saurí. 2008. A Factuality Profiler for Eventualities in Text. Ph.D. thesis, Brandeis University.

William F Styler IV, Steven Bethard, Sean Finan, Martha Palmer, Sameer Pradhan, Piet C de Groen, Brad Erickson, Timothy Miller, Chen Lin, Guergana Savova, et al. 2014. Temporal annotation in the clinical domain. Transactions of the Association for Computational Linguistics, 2:143–154.

Naushad UzZaman, Hector Llorens, Leon Derczynski, James Pustejovsky, and James Allen. 2013. SemEval-2013 Task 1: TempEval-3: Evaluating time expressions, events, and temporal relations. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Seventh International Workshop on Semantic Evaluation (SemEval 2013), SemEval ’13, pages 1–9, Stroudsburg, PA, USA. Association for Computational Linguistics.

Marc Verhagen, Robert Gaizauskas, Frank Schilder, Mark Hepple, Graham Katz, and James Pustejovsky. 2007. SemEval-2007 Task 15: TempEval Temporal Relation Identification. In Proceedings of the 4th International Workshop on Semantic Evaluations, SemEval ’07, pages 75–80, Stroudsburg, PA, USA. Association for Computational Linguistics.

Marc Verhagen, Roser Saurí, Tommaso Caselli, and James Pustejovsky. 2010. SemEval-2010 Task 13: TempEval-2. In Proceedings of the 5th International Workshop on Semantic Evaluation, SemEval ’10, pages 57–62, Stroudsburg, PA, USA. Association for Computational Linguistics.
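
The CONTAINS relations described in Section 3 pair a narrative anchor with each element it anchors, so that a timeline becomes a succession of containers rather than of isolated events. The following is a minimal sketch of that grouping step, not the format of the released EVENTI-NC data: the class `ContainsLink`, the function `group_into_containers`, and the in-memory representation are hypothetical names introduced here for illustration, and the links are hand-transcribed from Examples 2 and 3.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainsLink:
    """One CONTAINS relation (hypothetical representation, for illustration)."""
    anchor: str       # text of the narrative anchor (timex or event)
    anchor_type: str  # "TC" for a timex anchor, "EC" for an event anchor
    anchored: str     # text of the anchored event mention


def group_into_containers(links):
    """Group anchored events under their narrative anchor, in input order."""
    containers = defaultdict(list)
    for link in links:
        containers[(link.anchor_type, link.anchor)].append(link.anchored)
    return dict(containers)


# Links read off Example 2 (timex anchor "2001") and Example 3
# (event anchor "ricognizione"); transcribed by hand, not loaded from the corpus.
links = [
    ContainsLink("2001", "TC", "composta"),
    ContainsLink("2001", "TC", "fu"),
    ContainsLink("2001", "TC", "esecuzione"),
    ContainsLink("2001", "TC", "ritrovavano"),
    ContainsLink("ricognizione", "EC", "dato disposizioni"),
    ContainsLink("ricognizione", "EC", "fase"),
    ContainsLink("ricognizione", "EC", "ordinato"),
]

containers = group_into_containers(links)
for (anchor_type, anchor), events in containers.items():
    print(f"{anchor_type} {anchor}: {', '.join(events)}")
```

Keying each container on the pair (anchor type, anchor text) keeps Temporal Containers and Event Containers apart even if a timex and an event happened to share the same surface string; a real implementation would more plausibly key on markable identifiers.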