Temporal Relation Extraction: The Event Ordering Task

Hugo O. Sousa

INESC TEC, R. Dr. Roberto Frias, Porto, Portugal

hugo.o.sousa@inesctec.pt (H. O. Sousa)

Padua, Italy

2021

Although most Natural Language Processing tasks, such as Text Classification and Natural Language Translation, have experienced a major performance improvement due to recent advances in neural network architectures, Temporal Relation Extraction remains an open challenge. This leaves the door open for new research questions. In this paper, we provide a brief summary of the task and of some recent efforts to solve it. We also discuss some research opportunities that are yet to be explored.

Keywords: Temporal Relation Extraction, Information Retrieval, Natural Language Processing

1. Temporal Relation Extraction

Temporal Relation Extraction (TRE) is a Natural Language Processing (NLP) task focused on classifying the temporal relationship between entities, typically events or temporal expressions, found in a text. A model that can accurately classify such relationships would be able to place events on a timeline, making it temporally aware. This temporal knowledge could then be used in any time-sensitive NLP task, such as text summarization, natural language translation, or question answering, or more widely in knowledge bases. Despite many efforts in recent years, neural network architectures have failed to make the leap in effectiveness already seen in other NLP tasks, which leaves TRE an open challenge.

The roots of this task can be traced back to 2002, the year the TERQAS workshop took place. This workshop produced two important results: the Time Markup Language (TimeML) [1], the first annotation scheme that annotates temporal relationships; and TimeBank [2], the first corpus annotated with temporal relationships. Since then, many annotation schemes and datasets have been proposed. Some aim to make the annotation more complete, as in TimeBank-Dense [3] and TDDiscourse [4]; others cope with the specificities of other languages, such as the French TimeBank [5], the Portuguese TimeBank [6] and the Hindi TimeBank [7]. MATRES [8] is a more comprehensive effort, with the authors annotating multiple time axes of the text. In addition, domain-specific datasets were also annotated, such as THEE [9] for event-based surveillance systems in public health and THYME [10] for health records. Another effort of great importance for the TRE task was the SemEval competitions, most notably the TempEval shared tasks held in three different years: 2007 [11], 2010 [12], and 2013 [13].

Due to low inter-annotator agreement, the tendency over the years has been to simplify and refine the annotation scheme. For example, TimeBank was annotated with all 13 Allen interval relations [14], whereas in TimeBank-Dense the relation set consisted of only 6 interval relations. Similarly, in MATRES [8] the authors argue that the inter-annotator agreement was much lower for the relation between the end-points of events, so they decided to focus the annotation only on the start-points. The considerable number of datasets and annotation schemes makes it difficult to determine which model is the state of the art in TRE. To this end, we have been working on a Python package to facilitate comparison between different models. This will provide a common ground between them that the research community can build upon.

However, despite the many efforts made in recent years to train deep neural networks [15, 16, 17, 18], state-of-the-art models often rely on hand-crafted rules [19, 20, 21, 17, 22] that are domain-specific and laborious to develop. Another approach to TRE is to train the model to identify the absolute time at which each event occurred in the narrative [19, 23]. Once the absolute time of each event has been identified, the events can be placed on a timeline, where inferring their relations is trivial.

2. Research Questions

There are many research questions for this task that are worth discussing. For example, classifying all 13 Allen interval relations is typically difficult because most relations are underrepresented, leading to an unbalanced dataset. This problem can be addressed by transforming the interval relations into point relations between the start and end points of each interval [24]. Doing this results in only three relations, EQUAL, BEFORE or AFTER, which makes it easier to train the model.
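The interval-to-point transformation can be illustrated with a minimal sketch. It assumes each interval is given as a hypothetical `(start, end)` pair of numbers; the function names, the dictionary keys and the example intervals are illustrative choices, not part of any dataset or of the cited work. Each interval relation is described by four comparisons between end points, and every comparison takes one of the three values EQUAL, BEFORE or AFTER.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]  # (start, end), assuming start < end


def compare(p: float, q: float) -> str:
    """Point relation between two time points."""
    if p < q:
        return "BEFORE"
    if p > q:
        return "AFTER"
    return "EQUAL"


def point_relations(a: Interval, b: Interval) -> Dict[str, str]:
    """Describe the relation between intervals A and B by four point relations."""
    (a_start, a_end), (b_start, b_end) = a, b
    return {
        "startA-startB": compare(a_start, b_start),
        "startA-endB": compare(a_start, b_end),
        "endA-startB": compare(a_end, b_start),
        "endA-endB": compare(a_end, b_end),
    }


# Illustrative interval pairs for three of the Allen relations (A relative to B).
EXAMPLES = {
    "BEFORE": ((0, 1), (2, 3)),
    "MEETS": ((0, 2), (2, 3)),
    "OVERLAPS": ((0, 2), (1, 3)),
}

for name, (a, b) in EXAMPLES.items():
    print(name, point_relations(a, b))
```

In this representation a classifier only ever predicts one of three labels per point pair, instead of one of 13 labels per interval pair.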

When computing temporal closure [25] in a dataset annotated with Allen relations, it is common to derive entity pairs for which more than one relation is possible. For example, if A, B and C are events and the annotation says that A OVERLAPS B and B OVERLAPS C, then the relation between A and C can be MEETS, OVERLAPS or BEFORE. This opens the door to another possibility yet to be explored, which is to frame the problem as a Reinforcement Learning task [26]. In this framework, we can take full advantage of temporal closure by rewarding the model for any of the three relations, whereas this would not be possible in conventional deep neural networks.
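The OVERLAPS example can be checked with a small sketch. It derives the set of admissible relations by brute-force enumeration over intervals with small integer end points rather than by a composition table, and then uses that set in a reward function. The reward design (1.0 for any admissible relation, 0.0 otherwise) and all function names are assumptions made for illustration; the text does not prescribe a specific reward.

```python
from itertools import combinations, product
from typing import Set, Tuple

Interval = Tuple[int, int]  # (start, end), with start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation that holds between intervals a and b."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "BEFORE"
    if a2 == b1:
        return "MEETS"
    if b2 < a1:
        return "AFTER"
    if b2 == a1:
        return "MET_BY"
    if a1 == b1 and a2 == b2:
        return "EQUALS"
    if a1 == b1:
        return "STARTS" if a2 < b2 else "STARTED_BY"
    if a2 == b2:
        return "FINISHES" if a1 > b1 else "FINISHED_BY"
    if b1 < a1 and a2 < b2:
        return "DURING"
    if a1 < b1 and b2 < a2:
        return "CONTAINS"
    return "OVERLAPS" if a1 < b1 else "OVERLAPPED_BY"


# Three intervals have six end points, so the integers 0..5 are enough to
# realise every qualitative arrangement of A, B and C.
INTERVALS = list(combinations(range(6), 2))


def compose(r_ab: str, r_bc: str) -> Set[str]:
    """All relations A-C consistent with A r_ab B and B r_bc C."""
    return {
        allen_relation(a, c)
        for a, b, c in product(INTERVALS, repeat=3)
        if allen_relation(a, b) == r_ab and allen_relation(b, c) == r_bc
    }


def closure_reward(predicted: str, admissible: Set[str]) -> float:
    """Hypothetical reward: any relation allowed by closure counts as correct."""
    return 1.0 if predicted in admissible else 0.0


admissible = compose("OVERLAPS", "OVERLAPS")
print(sorted(admissible))  # ['BEFORE', 'MEETS', 'OVERLAPS']
print(closure_reward("MEETS", admissible), closure_reward("AFTER", admissible))
```

A single-label classification target would have to pick one of the three relations for the pair (A, C), whereas a reward of this kind treats all three as equally acceptable.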

Another promising approach is Graph Neural Networks (GNNs) [27]. Temporal relations have the natural structure of a graph, where the nodes are events or temporal expressions and the edges are the relations between them. GNNs have demonstrated the ability to take advantage of this kind of rich structure, making them a promising avenue for future research.
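The graph view can be made concrete with a small sketch. The node names, edge labels and two-dimensional feature vectors below are invented for illustration, and the single mean-aggregation step is only a simplified, untrained stand-in for what a GNN layer does; it ignores edge labels and has no learned weights.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical document: two events and one temporal expression.
nodes: List[str] = ["e1:arrived", "e2:spoke", "t1:Monday"]

# Directed, labelled edges: (source index, target index, temporal relation).
edges: List[Tuple[int, int, str]] = [
    (0, 1, "BEFORE"),
    (0, 2, "IS_INCLUDED"),
    (1, 2, "IS_INCLUDED"),
]

# Toy node features; in practice these would come from a text encoder.
features: Dict[int, List[float]] = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}


def mean_aggregate(
    feats: Dict[int, List[float]], graph_edges: List[Tuple[int, int, str]]
) -> Dict[int, List[float]]:
    """One message-passing step: average a node's own vector with the
    vectors of the nodes that point to it."""
    incoming: Dict[int, List[List[float]]] = defaultdict(list)
    for src, dst, _label in graph_edges:
        incoming[dst].append(feats[src])
    updated = {}
    for node, own in feats.items():
        messages = incoming[node] + [own]
        updated[node] = [
            sum(m[i] for m in messages) / len(messages) for i in range(len(own))
        ]
    return updated


updated = mean_aggregate(features, edges)
for index, name in enumerate(nodes):
    print(name, updated[index])
```

After one step, each node's vector already mixes in information from its temporal neighbours, which is the property that makes graph-based models attractive for this task.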

Acknowledgement

This work has been carried out as part of the project Text2Story, financed by the ERDF European Regional Development Fund through the North Portugal Regional Operational Programme (NORTE 2020), under PORTUGAL 2020, and by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, within project PTDC/CCI-COM/31857/2017 (NORTE-01-0145-FEDER-03185).

References

[1] J. Pustejovsky, J. M. Castano, R. Ingria, R. Sauri, R. J. Gaizauskas, A. Setzer, G. Katz, D. R. Radev, Timeml: Robust specification of event and temporal expressions in text, New directions in question answering 3 (2003) 28–34.

[2] J. Pustejovsky, P. Hanks, R. Sauri, A. See, R. Gaizauskas, A. Setzer, D. Radev, B. Sundheim, D. Day, L. Ferro, et al., The timebank corpus, in: Corpus linguistics, volume 2003, Lancaster, UK, 2003, p. 40.

[3] T. Cassidy, B. McDowell, N. Chambers, S. Bethard, An annotation framework for dense event ordering, Technical Report, CARNEGIE-MELLON UNIV PITTSBURGH PA, 2014.

[4] A. Naik, L. Breitfeller, C. Rose, Tddiscourse: A dataset for discourse-level temporal ordering of events, in: Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue, 2019, pp. 239–249.

[5] A. Bittar, P. Amsili, P. Denis, L. Danlos, French timebank: an iso-timeml annotated reference corpus, in: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011, pp. 130–134.

[6] F. Costa, A. Branco, Temporal information processing of a new language: Fast porting with minimal resources, in: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 2010, pp. 671–677.

[7] P. Goel, S. Prabhu, A. Debnath, P. Modi, M. Shrivastava, Hindi timebank: An iso-timeml annotated reference corpus, in: 16th Joint ACL-ISO Workshop on Interoperable Semantic Annotation PROCEEDINGS, 2020, pp. 13–21.

[8] Q. Ning, H. Wu, D. Roth, A multi-axis annotation scheme for event temporal relations, arXiv preprint arXiv:1804.07828 (2018).

[9] J. Niu, V. Ng, G. Penn, E. E. Rees, Temporal histories of epidemic events (thee): a case study in temporal annotation for public health, in: Proceedings of The 12th Language Resources and Evaluation Conference, 2020, pp. 2223–2230.

[10] W. F. Styler IV, S. Bethard, S. Finan, M. Palmer, S. Pradhan, P. C. De Groen, B. Erickson, T. Miller, C. Lin, G. Savova, et al., Temporal annotation in the clinical domain, Transactions of the association for computational linguistics 2 (2014) 143–154.

[11] M. Verhagen, R. Gaizauskas, F. Schilder, M. Hepple, G. Katz, J. Pustejovsky, Semeval-2007 task 15: Tempeval temporal relation identification, in: Proceedings of the fourth international workshop on semantic evaluations (SemEval-2007), 2007, pp. 75–80.

[12] M. Verhagen, R. Sauri, T. Caselli, J. Pustejovsky, Semeval-2010 task 13: Tempeval-2, in: Proceedings of the 5th international workshop on semantic evaluation, 2010, pp. 57–62.

[13] N. UzZaman, H. Llorens, L. Derczynski, J. Allen, M. Verhagen, J. Pustejovsky, Semeval-2013 task 1: Tempeval-3: Evaluating time expressions, events, and temporal relations, in: Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), 2013, pp. 1–9.

[14] J. F. Allen, Maintaining knowledge about temporal intervals, Communications of the ACM 26 (1983) 832–843.

[15] M. D. Ma, J. Sun, M. Yang, K.-H. Huang, N. Wen, S. Singh, R. Han, N. Peng, Eventplus: A temporal event understanding pipeline, arXiv preprint arXiv:2101.04922 (2021).

[16] Q. Ning, S. Subramanian, D. Roth, An improved neural baseline for temporal relation extraction, arXiv preprint arXiv:1909.00429 (2019).

[17] R. Han, I. Hsu, M. Yang, A. Galstyan, R. Weischedel, N. Peng, et al., Deep structured neural network for event temporal relation extraction, arXiv preprint arXiv:1909.10094 (2019).

[18] H. Wang, M. Chen, H. Zhang, D. Roth, Joint constrained learning for event-event relation extraction, arXiv preprint arXiv:2010.06727 (2020).

[19] Q. Do, W. Lu, D. Roth, Joint inference for event timeline construction, in: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012, pp. 677–687.

[20] Q. Ning, Z. Feng, D. Roth, A structured learning approach to temporal relation extraction, arXiv preprint arXiv:1906.04943 (2019).

[21] Q. Ning, Z. Feng, H. Wu, D. Roth, Joint reasoning for temporal and causal relations, arXiv preprint arXiv:1906.04941 (2019).

[22] R. Han, Q. Ning, N. Peng, Joint event and temporal relation extraction with shared representations and structured prediction, arXiv preprint arXiv:1909.05360 (2019).

[23] A. Leeuwenberg, M.-F. Moens, Towards extracting absolute event timelines from english clinical reports, IEEE/ACM Transactions on Audio, Speech, and Language Processing 28 (2020) 2710–2719.

[24] C. Freksa, Temporal reasoning based on semi-intervals, Artificial intelligence 54 (1992) 199–227.

[25] M. Verhagen, Times between the lines, Brandeis University, Massachusetts (2004).

[26] R. S. Sutton, A. G. Barto, Reinforcement learning: An introduction, MIT press, 2018.

[27] J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu, M. Sun, Graph neural networks: A review of methods and applications, CoRR abs/1812.08434 (2018). URL: http://arxiv.org/abs/1812.08434. arXiv:1812.08434.