<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Proceedings of the Text2Story'23 Workshop</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>End-to-End Temporal Relation Extraction in the Clinical Domain</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>José Javier Saiz</string-name>
          <email>josejavier.saiz.anton@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Begoña Altuna</string-name>
          <email>begona.altuna@ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <kwd-group>
          <kwd>Temporal Information Extraction</kwd>
          <kwd>End-To-End Relation Extraction</kwd>
          <kwd>Electronic Health Records</kwd>
        </kwd-group>
        <aff id="aff0">
          <label>0</label>
          <institution>HiTZ Basque Center for Language Technologies - Ixa NLP Group, University of the Basque Country UPV/EHU</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of the Basque Country (UPV/EHU)</institution>
          ,
          <addr-line>Arriola Pasealekua 2, Donostia, Gipuzkoa, 20018</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
        <p>Temporal relation extraction is an important task in the clinical domain, as it allows a better understanding of the temporal context of clinical events. In this paper, we present an end-to-end temporal relation extraction system for the clinical domain, using the i2b2 2012 Temporal Relation challenge as a benchmark. In our proposal, we fine-tune REBEL, a sequence-to-sequence model for general relation extraction, with temporal annotations and discharge summaries. Our proposal is then able to simultaneously extract relevant clinical entities, time expressions and the temporal relations between them. Our results demonstrate the effectiveness of this approach, achieving reasonable performance on the End-To-End track of the i2b2 2012 Challenge.</p>
        <p>In: R. Campos, A. Jorge, A. Jatowt, S. Bhatia, M. Litvak (eds.): Proceedings of the Text2Story'23 Workshop, Dublin (Republic of Ireland), 2 April 2023.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Patients’ medical information is stored in Electronic Health Records (EHRs), which contain
structured data (e.g. demographics, vital signs and test results) and free text, such as reports
and discharge summaries. The latter is the most informative, but the free text format makes
clinical narratives prone to information overload, redundancy and poor access to information.
This leads to inefficiencies [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and can ultimately impact patient care negatively.
      </p>
      <p>Extracting structured information from free-text clinical narratives can improve information
accessibility and enhance patient care by facilitating clinical workflow. For example, Temporal
Relation Extraction (TRE), which involves identifying events and time anchors according to
their temporal features and then classifying the relations between these entities, can be applied
to clinical timeline summarisation and ICD-10 code ordering, thus aiding medical research and
patient care.</p>
      <p>
        However, clinical TRE presents several challenges. Clinical narratives often have a high
density of technical information and a concise writing style, which can make language modelling
challenging [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In addition, the clinical lexicon and syntax can vary significantly across regions,
institutions and medical specialties, making it difficult to develop universal approaches.
      </p>
      <p>
        Our TRE system is designed to extract entities and temporal relations from clinical narratives
(as shown in Figure 1). It is an end-to-end approach because it performs all TRE tasks
simultaneously, namely event and temporal expression identification and temporal relation classification, as a
sequence-to-sequence problem. To achieve this, we fine-tune REBEL (Relation Extraction By
End-to-end Language generation) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], a pre-trained model based on the BART (Bidirectional
Auto-Regressive Transformer) architecture [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which extracts triplet sequences from general
domain text. We use the i2b2 Temporal Relation corpus, which contains clinical narratives
annotated with temporal information, to train and evaluate our system. Fine-tuning helps to
adapt the model to the specifics of clinical text and the task of extracting clinical temporal
relations with a small amount of annotated data and training time.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        TRE approaches have evolved from rule-based systems to traditional machine learning systems
with specialised classifiers and heuristics, and then to deep learning systems [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In the general
domain, recent approaches incorporate advanced DNN-based models capable of learning
high-level representations for TRE. These methods can include advanced neural language models
such as BERT [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ], as well as graph-based architectures that capture the global structure of
temporal relations in a text and are able to tackle document-level relation extraction [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ].
      </p>
      <p>
        In the clinical domain, current state-of-the-art TRE systems are based on pre-trained BERT
models or variants of this architecture [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. However, recent approaches often lack flexibility
and ease of use because they typically focus on a specific task, such as relation classification [
        <xref ref-type="bibr" rid="ref11 ref12 ref13">11,
12, 13</xref>
        ], and are limited to building temporal relations from gold standard entities. Furthermore,
some only address a limited subset of relations [
        <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
        ], such as explicit intra-sentence temporal
relations, as seen in systems evaluated on the Direct Temporal Relations corpus from Lee et al.
[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. There is a need for further development of end-to-end strategies that can handle all types
of temporal relations with larger dependencies, including cross-sentence and implicit temporal
relations, which this work addresses.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Resources</title>
      <p>In this section, we describe the dataset and approach chosen for the development of our clinical
TRE system. We provide details on the dataset features, the representation of the input and
output data, the model architecture and the limitations of the approach.</p>
      <sec id="sec-3-1">
        <title>3.1. Dataset</title>
        <p>
          The dataset used in this work was developed by the Informatics for Integrating Biology and the
Bedside (i2b2) project for shared clinical NLP tasks and consists of 310 discharge summaries
annotated with temporal information. Discharge summaries are divided into two main sections:
clinical history (recent clinical history up to admission) and hospital course (hospital course and
treatment plan after discharge). Both sections include annotations for temporal information
based on the ISO-TimeML standard [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], including clinical events, time expressions, and
temporal relations. On average, a discharge summary contains 86.6 events, 12.4 time expressions, and
176 temporal relations [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. Table 1 provides statistics on the raw number of entities and types
in the whole corpus.
        </p>
        <p>
          The labels used in the dataset have types and attributes to properly represent and classify
text tokens, as defined and described by Sun et al. [19]:
        </p>
        <list list-type="bullet">
          <list-item>
            <p>The EVENT tag represents relevant events or states in the patient’s clinical timeline, such
as “follow-up” or “admission”. It includes a “type” attribute to specify the event type and
“modality” and “polarity” attributes to indicate the certainty and valuation of the event.</p>
          </list-item>
          <list-item>
            <p>The TIMEX3 tag is used for temporal expressions, including dates, durations, times, and
frequencies, such as “March 21, 2021” or “3 hours”. TIMEX3 tags’ additional attributes
are “val”, which holds the normalised value of the time expression, and “mod”, which holds
the time modifier value.</p>
          </list-item>
          <list-item>
            <p>The TLINK tag encodes three types of relations: overlap, before and after. TLINKs relate
each event to the document’s admission or discharge date (“Section Time TLINKs”). In
addition, TLINKs connect EVENTs and TIMEX3s within the same sentence or across
multiple sentences (“Non-Section Time TLINKs”).</p>
          </list-item>
        </list>
      </sec>
      <sec id="sec-3-2a">
        <title>3.2. Model</title>
        <p>
          To develop our system, we fine-tuned REBEL (Relation Extraction By End-to-end Language
generation) [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], which is a pre-trained model based on the BART architecture. REBEL frames
relation extraction as a sequence-to-sequence task and is trained to produce a sequence of
relational facts from the input text. To represent relational facts, REBEL groups entities and
relations into triplet sequences and represents them in a linear text string using special marker
tokens. For example, consider the following text:
        </p>
        <p>“After the accident, the patient was admitted for surgery and rehab.”</p>
        <p>Here, we find two relational facts: “(accident, before, surgery)” and “(accident, before, rehab)”,
which are represented as the following sequence of triplets:</p>
        <p>“&lt;triplet&gt; accident &lt;prob&gt; surgery &lt;tret&gt; before &lt;prob&gt; rehab &lt;tret&gt; before”
Entity marker tokens, enclosed in angle brackets, indicate the order of the entities in a relation:
&lt;triplet&gt; indicates the start of a relation, while the subsequent tokens indicate whether the
entity they accompany is a head or tail within the relation. We elaborate on this in section 3.3.</p>
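        <p>For illustration, the linearised format above can be decoded back into relational facts with a small state machine. The following is a minimal sketch under our own assumptions about the marker set and parsing rules; it is not REBEL's exact implementation:</p>
        <preformat><![CDATA[
```python
import re

def decode_triplets(seq):
    """Decode a REBEL-style typed triplet sequence into (head, tail, relation) facts.

    Text spans cycle through the slots head -> tail -> relation; further
    tail/relation pairs after a relation share the same head. Entity-type
    markers (e.g. <prob>, <tret>) are skipped in this simplified sketch.
    """
    parts = [p.strip() for p in re.split(r"(<[^>]+>)", seq) if p.strip()]
    facts = []
    head = tail = None
    slot = None
    for p in parts:
        if p == "<triplet>":
            slot = "head"                     # a new relation group starts
        elif p.startswith("<") and p.endswith(">"):
            continue                          # entity-type marker token
        elif slot == "head":
            head, slot = p, "tail"
        elif slot == "tail":
            tail, slot = p, "relation"
        elif slot == "relation":
            facts.append((head, tail, p))
            slot = "tail"                     # next pair keeps the same head
    return facts
```
]]></preformat>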
        <p>The triplet sequences, along with the corresponding context, are used as labels during training.
During inference, the objective of the model is to extract a sequence of triplets representing the
semantic relations contained in an input text.</p>
        <p>Formally, there is a text x and a sequence of relations y = (y<sub>1</sub>, ..., y<sub>n</sub>), with n being the length of
y, that is, the proposed number of relations found in the input text x. The model must estimate
the value of y, i.e. yield ŷ, that maximises the conditional probability p(y | x). This probability
can be decomposed as the joint probability of the sequence of relations involved, that is, as the
product of the probabilities of generating the relation y<sub>i</sub> conditioned on the text x and also on
the previous relations found y<sub>&lt;i</sub>, as shown in expression (1).</p>
        <p>ŷ = arg max p(y | x) = arg max p(y<sub>1</sub>, ..., y<sub>n</sub> | x) = arg max ∏<sub>i=1</sub><sup>n</sup> p(y<sub>i</sub> | y<sub>&lt;i</sub>, x) (1)</p>
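        <p>As a worked illustration of expression (1), the joint probability of a relation sequence is the product of per-relation conditionals; the numbers below are invented for the sketch, not model outputs:</p>
        <preformat><![CDATA[
```python
import math

# Toy conditional probabilities p(y_i | y_<i, x) for a three-relation sequence.
cond_probs = [0.9, 0.8, 0.5]

# Joint probability of the whole relation sequence, as in expression (1).
joint = math.prod(cond_probs)

# In practice the sum of log-probabilities is used to avoid numerical underflow.
log_joint = sum(math.log(p) for p in cond_probs)
```
]]></preformat>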
        <p>By fine-tuning REBEL, our system would benefit from several advantages of the core
architecture that are most appropriate for clinical TRE. For one, REBEL can handle longer context
sequences (up to 1024 tokens) and extract relations that span multiple sentences. This is
important for capturing document time and cross-sentence relations, which are common in clinical
TRE corpora. Furthermore, REBEL does not require any entity annotation or pre-processing,
unlike other systems that may rely on entity recognition or linking modules. Finally, REBEL
can be easily adapted to specific domains and annotation schemes with limited resources and
time, as it is pre-trained on a large number of relations. This is also helpful because there are
different annotation schemes for temporal relation extraction, and developing such specific
systems from scratch would be expensive.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.3. Data representation</title>
        <p>REBEL is pre-trained on a distantly supervised dataset of 220 relation types generated by linking
Wikidata entities and English Wikipedia abstracts. In order to fine-tune REBEL for the task
of temporal relation extraction, the small size of the training dataset posed the challenge of
accurately learning new entities and relations. Therefore, and given that REBEL was not trained
with temporal relations, we adapted the annotations in the i2b2 2012 corpus by using triplet
sequences with textual representations that the model learned during its pre-training phase.</p>
        <p>The relation types in the i2b2 2012 corpus were modified to match the textual form of the
pre-training relations. The most appropriate textual representations were selected on the basis
of semantic similarity and conciseness. For example, the relation type “after” was represented
as “follows” and “before” as “followed by” in the triplet sequences. In the case of the “overlap”
relation, we experimented with different representations such as “said to be the same as” or
“partially coincident with”. However, larger representations were found to propagate errors
more easily, so we finally chose “same as”.</p>
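        <p>A sketch of this mapping, assuming a simple lookup from TLINK types to the surface forms chosen above:</p>
        <preformat><![CDATA[
```python
# Mapping from i2b2 2012 TLINK types to textual surface forms close to
# REBEL's pre-training relations, as chosen in the text.
TLINK_TO_SURFACE = {
    "AFTER": "follows",
    "BEFORE": "followed by",
    "OVERLAP": "same as",
}

def surface_form(tlink_type: str) -> str:
    """Return the textual representation used in the triplet sequences."""
    return TLINK_TO_SURFACE[tlink_type.upper()]
```
]]></preformat>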
        <p>We also modified the entity marker tokens. The original REBEL model used only three
tokens to structure the triplet sequence: &lt;triplet&gt;, &lt;subj&gt; and &lt;obj&gt;. We added new tokens
that indicated the entity types as well. That is, for each of the 10 entity types in the i2b2 2012
corpus, we created a new entity marker token with a four-character abbreviation. For example,
&lt;tret&gt; for the EVENT type “treatment” and &lt;freq&gt; for the TIMEX3 type “frequency”. We added
these tokens to the embedding dimension to ensure that they were processed accurately.</p>
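        <p>The token-addition step can be pictured with a toy vocabulary and embedding table; with the transformers library this would correspond to <monospace>tokenizer.add_tokens(...)</monospace> followed by <monospace>model.resize_token_embeddings(...)</monospace>. Dimensions and vectors below are illustrative:</p>
        <preformat><![CDATA[
```python
import random

random.seed(0)

# Toy vocabulary and 4-dimensional embedding table standing in for the
# model's real tokenizer and embedding matrix.
vocab = {"the": 0, "patient": 1}
embeddings = [[random.random() for _ in range(4)] for _ in vocab]

# Abbreviated entity-type markers added for the i2b2 2012 entity types.
NEW_MARKERS = ["<triplet>", "<tret>", "<freq>"]

for tok in NEW_MARKERS:
    vocab[tok] = len(vocab)                                  # new vocabulary entry
    embeddings.append([random.random() for _ in range(4)])   # new embedding row
```
]]></preformat>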
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>The following section describes the fine-tuning process and evaluation of our system. In order
to be comparable with other systems, we followed the rules of the i2b2 2012 Challenge and used
the split provided, which consists of a train set of 190 documents and a test set of 120 documents
from the i2b2 Temporal Relation corpus. Our modeling setup consisted of the following steps:
1. We segmented the 190 documents into texts of 512 characters each. This segmentation
method accounted for sentences with different lengths and facilitated a more consistent
division. Moreover, this also enabled us to utilize the maximum amount of text that
could fit within the memory constraints of our model. We also made sure that each
text contained the admission and discharge dates at the beginning, so that we could
also extract the Section Time TLINKs. This resulted in 1,139 training instances, each
comprising an input-output text pair, as illustrated by Figure 2.</p>
      <p>2. We fine-tuned the pre-trained REBEL model by iteratively feeding the train instances.</p>
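        <p>The segmentation in step 1 can be sketched as follows; the header wording and the naive character cuts are our simplification (the paper's segmentation also accounts for sentence boundaries):</p>
        <preformat><![CDATA[
```python
def segment(document, admission, discharge, size=512):
    """Split a discharge summary into chunks of at most `size` characters,
    each prefixed with the admission and discharge dates so that Section
    Time TLINKs remain extractable from every chunk."""
    header = f"Admission date: {admission}. Discharge date: {discharge}. "
    body_size = size - len(header)
    return [header + document[i:i + body_size]
            for i in range(0, len(document), body_size)]
```
]]></preformat>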
      <p>
        Our system was obtained at the last checkpoint after fine-tuning the base REBEL model for
10 epochs, and following the training parameters shown in Table 2. Then, the system was
evaluated with the End-To-End track from the i2b2 2012 Challenge [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The evaluation setup
consisted of the following process:
1. We divided the 120 documents in the test set into 512-character texts, each beginning
with the admission and discharge dates.
2. We fed each text into the fine-tuned model, which conditionally generated text sequences
of triplets.
3. We decoded the triplet sequences and compiled them into XML format, which were then
evaluated using the i2b2 2012 Temporal Evaluation Scripts. No additional post-processing
was performed.
      </p>
      <p>The BART architecture employs conditional text generation to generate predictions, which
can be influenced by modifying various decoding parameters. We chose the Beam Search
strategy among multiple decoding strategies available, as it produces more consistent results by
exploring and scoring multiple output sequences in parallel [20]. Since the order of the entities
and marker tokens affects the accuracy of the predictions, consistency of the predicted tokens is
crucial to prevent errors from propagating to the subsequent sequence of text. The Beam Search
strategy considers multiple output sequences simultaneously and generates subsequent tokens
based on the top-k sequences with the highest scores, ensuring that the extracted relations
are consistent with the input sequence. The parameters used to generate the predictions for
evaluation are shown in Table 2. For reproducibility purposes, the code to replicate the dataset
and modelling setup is available on our GitHub page under the CC BY-NC-SA 4.0 licence<sup>1</sup>. For
confidentiality reasons, data and evaluation scripts are only available on request from the n2c2
organisation.</p>
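        <p>The decoding strategy can be illustrated with a self-contained toy beam search over a fixed next-token distribution; the vocabulary and probabilities are invented for the sketch, and in practice this is handled by the generation parameters in Table 2:</p>
        <preformat><![CDATA[
```python
import heapq
import math

# A toy next-token model: log-probabilities over a tiny vocabulary,
# independent of context (purely illustrative).
LOGPROBS = {"a": math.log(0.5), "b": math.log(0.3), "</s>": math.log(0.2)}

def beam_search(num_beams=2, max_len=3):
    """Keep the num_beams highest-scoring partial sequences at every step,
    as in the Beam Search decoding strategy described above."""
    beams = [(0.0, [])]  # (cumulative log-prob, tokens)
    for _ in range(max_len):
        candidates = []
        for score, toks in beams:
            if toks and toks[-1] == "</s>":
                candidates.append((score, toks))  # finished beam carried over
                continue
            for tok, lp in LOGPROBS.items():
                candidates.append((score + lp, toks + [tok]))
        beams = heapq.nlargest(num_beams, candidates, key=lambda c: c[0])
    return beams[0][1]  # highest-scoring sequence
```
]]></preformat>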
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>
        To be consistent with previous benchmarks, we adopt the TempEval3 evaluation metrics used in
the original i2b2 2012 challenge [
        <xref ref-type="bibr" rid="ref18">18</xref>
], which calculate the Precision, Recall and Micro-average
F1 scores. Here, the evaluation metrics differ from the F1 used in conventional multi-class
settings in that Precision is computed by verifying each predicted relation against the transitive
closure of the gold standard, and Recall is computed by verifying each gold standard relation
against the transitive closure of the predictions. Table 3 shows our system’s results relative to
those of the best performing systems in the i2b2 2012 challenge.
<sup>1</sup>https://github.com/jsaizant/ETEREX-REBEL
      </p>
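        <p>The closure-based scoring can be sketched for a toy graph of “before” relations only (a simplification: the real TempEval scripts also handle overlap and after relations):</p>
        <preformat><![CDATA[
```python
from itertools import product

def closure(relations):
    """Transitive closure of a set of (a, 'before', b) relations."""
    rel = set(relations)
    changed = True
    while changed:
        changed = False
        for (a, _, b), (c, _, d) in product(list(rel), list(rel)):
            if b == c and (a, "before", d) not in rel:
                rel.add((a, "before", d))
                changed = True
    return rel

def prf(gold, pred):
    """TempEval-style scoring: precision checks predictions against the
    closure of the gold standard; recall checks gold relations against the
    closure of the predictions."""
    p = sum(r in closure(gold) for r in pred) / len(pred)
    r = sum(r in closure(pred) for r in gold) / len(gold)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```
]]></preformat>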
      <p>Our system delivered mixed results when tested on the i2b2 2012 dataset. In the EVENT and
TIMEX3 extraction tasks, the system’s F1 scores were 0.78 and 0.77 respectively, which is below
the best performing systems. However, in the TLINK extraction task, the system achieved an
F1 score of 0.58, ranking third among the best performing systems. In terms of architecture,
the other three systems use a pipeline approach consisting of multiple CRF and SVM classifiers
and rule-based methods for time expression detection and/or normalisation [21, 22, 23]. It is
also worth noting that while our experiments are limited to the i2b2 2012 corpus, they used
additional corpora to build their event classifiers: Tang et al. [21] and Xu et al. [22] used the
i2b2 2010 corpus and Roberts et al. [23] used additional text resources from PubMed, Wikipedia
and other medical records. Comparatively, our training data is more restricted, which serves to
demonstrate the adaptability of our approach.</p>
      <p>Despite its poor performance on the EVENT and TIMEX3 tracks, the system’s performance
on the TLINK extraction is relatively stronger, suggesting that it excels at recognising temporal
relations between entities rather than recognising the entities themselves. The results are in
line with expectations for a system performing the event, time expression and temporal relation
tasks simultaneously. The TRE task has traditionally consisted of a first subtask of entity
extraction and a second subtask of relation extraction, and high performance in the first was
crucial for good results in the second. Our system, fine-tuned directly for relation extraction,
does not seem to be so dependent. However, improving the system’s ability to recognise clinical
and temporal entities could potentially improve its performance in the TLINK extraction task.</p>
      <p>Consistent with the overall extraction results, the remarkably low scores in the
classification of event and time expression types do not seem to be crucial for the temporal relation
extraction task. However, taking into account the semantic information encoded in the entity
type could also help in the classification of temporal relations.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Discussion</title>
      <p>Our system uses the REBEL framework for Temporal Relation Extraction, a high-level NLP task
that requires contextual temporal information. Since REBEL uses the BART architecture, the
bi-directional encoder and attention mechanism allows the model to focus on specific parts of
the input text and weigh their importance in relation to the task at hand, giving the model a
greater capacity to process longer texts. The system’s capabilities have been further enhanced
by a fine-tuning process using contextual texts and triplet sequences with lengths of up to 1024
embedded tokens. These longer sequences allow the temporal context of the information to be
better captured.</p>
      <p>The i2b2 2012 corpus contains the “BEFORE” type as the most frequent TLINK type. To
assess the performance of our proposal, we perform an evaluation using a test set consisting of
pairs of entities predicted by the model and the most frequent TLINK class, i.e. the “BEFORE”
type. We refer to this evaluation as the baseline evaluation, and its purpose is to determine
whether the performance of the model is influenced by the most frequent class, and whether
this influence affects the performance on the other classes. The baseline score shows precision
and recall values of 0.50 and 0.25 respectively, and its F-score is 0.33, which is 0.25 lower than
the F-score of our model prediction. Although the proposed model performed better than the
baseline evaluation, the difference in F-scores was not significant. This suggests that the model
is indeed biased by the most frequent TLINK class and that there is still potential to improve the
overall performance of the model. While the model successfully identified most of the entity
pairs, the classification of the TLINK type was challenging. Indeed, most of the narratives tend
to be written in the past tense, which favours the temporal order in a certain direction. This
tendency creates a class imbalance in the temporal annotations, leading to a bias in the model’s
predictions, which is where further work is most needed and what we discuss next.</p>
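        <p>The baseline described above can be sketched as a majority-class assignment over the predicted entity pairs; the data tuples below are invented for the sketch:</p>
        <preformat><![CDATA[
```python
from collections import Counter

def majority_baseline(train_tlinks, test_pairs):
    """Assign every test entity pair the most frequent TLINK type seen in
    training (the 'BEFORE'-heavy baseline discussed above)."""
    majority = Counter(t for _, t, _ in train_tlinks).most_common(1)[0][0]
    return [(a, majority, b) for a, b in test_pairs]
```
]]></preformat>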
      <p>Despite the large number of temporal annotations in the training dataset, not all the possible
relations are considered, making it difficult to train a system that produces consistent predictions.
For example, consider the set of entities in Figure 3. There may be several ways to label the
relations between them, such as “A” before “B” and “B” before “C”, or “B” after “A” and “C” after
“B”. In both cases, the relation between “A” and “C” remains unlabelled because not all relations
are explicitly annotated. This carries over to system inference, where we find that reciprocal
relations are not identified at all, and transitive relations are usually overlooked.</p>
      <p>To overcome the sparsity of TLINK annotations in the training set and to improve the
performance of the system, we propose two solutions. First, we propose the use of transitive
closure, which derives implicit relations from existing labelled relations, thereby increasing
the number of annotations. Together with the integration of reciprocal relations into the
training instances, this approach has been shown to mitigate the imbalance of TLINK types
and improve system performance [24]. Secondly, we propose to use a training corpus with
narrative container annotations [25], such as the E3C corpus [26], which prevents redundant
relations, but also reduces the distance of temporal dependencies, narrowing the context needed
for relation identification.</p>
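        <p>The first proposal, augmenting annotations with reciprocal relations, can be sketched as follows; transitive closure would be layered on top in the same way:</p>
        <preformat><![CDATA[
```python
# Inverse of each TLINK type; OVERLAP is its own inverse.
INVERSE = {"BEFORE": "AFTER", "AFTER": "BEFORE", "OVERLAP": "OVERLAP"}

def add_reciprocals(tlinks):
    """Augment a TLINK set with the reciprocal of every annotated relation,
    one of the two augmentation strategies proposed above."""
    out = set(tlinks)
    for a, t, b in tlinks:
        out.add((b, INVERSE[t], a))
    return out
```
]]></preformat>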
      <p>In section 5 we observed that the extraction of relations in our system is less dependent on the
extraction of event and time expressions than traditional TRE systems. However, better entity
identification could improve the overall performance. For sequence-to-sequence architectures
such as our system, it is crucial to find the most effective textual representation of information.
As explained in section 3.3, we adapted the annotations of the i2b2 2012 corpus by matching
TLINK types to similar pre-training relations and creating abbreviated entity markers, which
proved to be a successful adaptation. To further improve entity extraction, a potential solution
is to assign token embeddings from the pre-trained model to the new entity markers. This
will improve entity extraction by using existing weights and biases for entity markers such as
&lt;tret&gt; and &lt;freq&gt; rather than training new weights from scratch. Indeed, further exploration
of different token representation techniques is needed to understand their impact on model
performance.</p>
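        <p>The proposed initialisation can be pictured with a toy embedding lookup: seed each new marker token from the pre-trained vector of a semantically related word instead of random values. The vectors and the marker-to-word map are illustrative assumptions:</p>
        <preformat><![CDATA[
```python
# Toy pre-trained embedding table (3-dimensional vectors, invented values).
pretrained = {
    "treatment": [0.2, 0.4, 0.6],
    "frequency": [0.1, 0.3, 0.5],
}

# Hypothetical map from new marker tokens to related pre-trained words.
SEED_WORD = {"<tret>": "treatment", "<freq>": "frequency"}

embeddings = dict(pretrained)
for marker, word in SEED_WORD.items():
    # Copy the seed vector so the marker embedding can diverge during training.
    embeddings[marker] = list(pretrained[word])
```
]]></preformat>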
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>This article presents our end-to-end system for clinical Temporal Relation Extraction, developed
by fine-tuning REBEL with temporal annotations and discharge summaries from the i2b2
Temporal Relation corpus. Our system uses a sequence-to-sequence approach that annotates
raw clinical narratives and extracts relations between entities in a single step, rather than
relying on multiple mechanisms and highly engineered linguistic features. This makes our
system more (re)usable and less dependent on specific texts or tasks. We have evaluated our
system in the End-To-End track of the i2b2 2012 Challenge and achieved reasonable results,
showing that our approach can handle complex clinical domains with limited resources and
time. We plan to explore ways to improve our system in future work, such as using pre-trained
token embeddings for the entity markers in the triplet sequences and training the system on a
corpus of narrative containers.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This work has been partially funded by the Basque Government postdoctoral grant POS 2022 2
0024. We would also like to thank Arantza Casillas and Alicia Pérez for their advice and help.</p>
      <p>[18] […] i2b2 Challenge, Journal of the American Medical Informatics Association 20 (2013).
doi:10.1136/amiajnl-2013-001628.</p>
      <p>[19] W. Sun, A. Rumshisky, O. Uzuner, Annotating temporal information in clinical narratives,
Journal of Biomedical Informatics 46 (2013). doi:10.1016/j.jbi.2013.07.004.</p>
      <p>[20] S. Welleck, I. Kulikov, S. Roller, E. Dinan, K. Cho, J. Weston, Neural Text Generation with
Unlikelihood Training, CoRR abs/1908.04319 (2019). URL: http://arxiv.org/abs/1908.04319.</p>
      <p>[21] B. Tang, Y. Wu, M. Jiang, Y. Chen, J. C. Denny, H. Xu, A hybrid system for temporal
information extraction from clinical text, Journal of the American Medical Informatics
Association 20 (2013) 828–835. doi:10.1136/amiajnl-2013-001635.</p>
      <p>[22] Y. Xu, Y. Wang, T. Liu, J. Tsujii, E. I.-C. Chang, An end-to-end system to identify temporal
relation in discharge summaries: 2012 i2b2 challenge, Journal of the American Medical
Informatics Association 20 (2013) 849–858. doi:10.1136/amiajnl-2012-001607.</p>
      <p>[23] K. Roberts, B. Rink, S. M. Harabagiu, A flexible framework for recognizing events, temporal
expressions, and temporal relations in clinical text, Journal of the American Medical
Informatics Association 20 (2013) 867–875. doi:10.1136/amiajnl-2013-001619.</p>
      <p>[24] G. Alfattni, N. Peek, G. Nenadic, Extraction of temporal relations from clinical free text: A
systematic review of current approaches, Journal of Biomedical Informatics 108 (2020).
doi:10.1016/j.jbi.2020.103488.</p>
      <p>[25] J. Pustejovsky, A. Stubbs, Increasing Informativeness in Temporal Annotation, in:
Proceedings of the 5th Linguistic Annotation Workshop, Association for Computational Linguistics,
Portland, Oregon, USA, 2011, pp. 152–160. URL: https://aclanthology.org/W11-0419.</p>
      <p>[26] B. Magnini, B. Altuna, A. Lavelli, M. Speranza, R. Zanoli, The E3C project: Collection and
annotation of a multilingual corpus of clinical cases, CEUR Workshop Proceedings 2769
(2020). doi:10.4000/books.aaccademia.8663.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mathioudakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Rousalova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Gagnat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Saad</surname>
          </string-name>
          , G. Hardavella,
          <article-title>How to keep good clinical records</article-title>
          ,
          <source>Breathe</source>
          <volume>12</volume>
          (
          <year>2016</year>
          ). doi:10.1183/20734735.018016.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tiwari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          , et al.,
          <article-title>Pre-trained Language Models in Biomedical Domain: A Survey from Multiscale Perspective</article-title>
          , arXiv e-prints (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P. L. H.</given-names>
            <surname>Cabot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          , REBEL:
          <article-title>Relation Extraction By End-to-end Language generation</article-title>
          ,
          <source>Findings of ACL: EMNLP 2021</source>
          (
          <year>2021</year>
          ). doi:10.18653/v1/2021.findings-emnlp.204.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghazvininejad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <article-title>BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension</article-title>
          ,
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          (
          <year>2020</year>
          ). doi:10.18653/v1/2020.acl-main.703.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y. B.</given-names>
            <surname>Gumiel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. E. S. E.</given-names>
            <surname>Oliveira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Claveau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Grabar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. C.</given-names>
            <surname>Paraiso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Moro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          ,
          <article-title>Temporal Relation Extraction in Clinical Texts</article-title>
          ,
          <source>ACM Computing Surveys</source>
          <volume>54</volume>
          (
          <year>2022</year>
          ). doi:10.1145/3462475.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Ning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Extracting Temporal Event Relation with Syntax-guided Graph Transformer</article-title>
          , in:
          <source>Findings of the Association for Computational Linguistics: NAACL 2022</source>
          , Association for Computational Linguistics, Seattle, United States,
          <year>2022</year>
          , pp.
          <fpage>379</fpage>
          -
          <lpage>390</lpage>
          . URL: https://aclanthology.org/2022.findings-naacl.29. doi:10.18653/v1/2022.findings-naacl.29.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Durrett</surname>
          </string-name>
          ,
          <article-title>Effective Distant Supervision for Temporal Relation Extraction</article-title>
          ,
          <source>CoRR abs/2010.12755</source>
          (
          <year>2020</year>
          ). URL: https://arxiv.org/abs/2010.12755. arXiv:2010.12755.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hsu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <article-title>Temporal Relation Extraction with a Graph-Based Deep Biaffine Attention Model</article-title>
          ,
          <source>CoRR abs/2201.06125</source>
          (
          <year>2022</year>
          ). URL: https://arxiv.org/abs/2201.06125. arXiv:2201.06125.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>X.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Xuan</surname>
          </string-name>
          ,
          <article-title>Event temporal relation extraction with attention mechanism and graph neural network</article-title>
          ,
          <source>Tsinghua Science and Technology</source>
          <volume>27</volume>
          (
          <year>2022</year>
          )
          <fpage>79</fpage>
          -
          <lpage>90</lpage>
          . doi:10.26599/TST.2020.9010063.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A. L.</given-names>
            <surname>Olex</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. T.</given-names>
            <surname>McInnes</surname>
          </string-name>
          ,
          <article-title>Review of Temporal Reasoning in the Clinical Domain for Timeline Extraction: Where we are and where we need to be</article-title>
          ,
          <source>Journal of Biomedical Informatics</source>
          <volume>118</volume>
          (
          <year>2021</year>
          ). doi:10.1016/j.jbi.2021.103784.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dligach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Sadeque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bethard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Savova</surname>
          </string-name>
          ,
          <article-title>A BERT-based One-Pass Multi-Task Model for Clinical Temporal Relation Extraction</article-title>
          , in:
          <source>Proceedings of the BioNLP 2020 Workshop</source>
          , Association for Computational Linguistics (
          <year>2020</year>
          ). doi:10.18653/v1/2020.bionlp-1.7.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Caufield</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ping</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Clinical Temporal Relation Extraction with Probabilistic Soft Logic Regularization and Global Inference</article-title>
          ,
          <year>2020</year>
          . URL: https://arxiv.org/abs/2012.08790. doi:10.48550/ARXIV.2012.08790.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>H. U.</given-names>
            <surname>Haq</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kocaman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Talby</surname>
          </string-name>
          ,
          <article-title>Deeper Clinical Document Understanding Using Relation Extraction</article-title>
          ,
          <year>2021</year>
          . arXiv:2112.13259.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>H.</given-names>
            <surname>Guan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Devarakonda</surname>
          </string-name>
          ,
          <article-title>Robustly Pre-trained Neural Model for Direct Temporal Relation Extraction</article-title>
          ,
          <year>2020</year>
          . arXiv:2004.06216.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G.</given-names>
            <surname>Alfattni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Peek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Nenadic</surname>
          </string-name>
          ,
          <article-title>Attention-based bidirectional long short-term memory networks for extracting temporal relationships from clinical discharge summaries</article-title>
          ,
          <source>Journal of Biomedical Informatics</source>
          <volume>123</volume>
          (
          <year>2021</year>
          )
          <fpage>103915</fpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S1532046421002446. doi:10.1016/j.jbi.2021.103915.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.-J.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>Identifying direct temporal relations between time and events from clinical notes</article-title>
          ,
          <source>BMC Medical Informatics and Decision Making</source>
          <volume>18</volume>
          (
          <year>2018</year>
          ). URL: https://doi.org/10.1186/s12911-018-0627-5. doi:10.1186/s12911-018-0627-5.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pustejovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bunt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Romary</surname>
          </string-name>
          ,
          <article-title>ISO-TimeML: An international standard for semantic annotation</article-title>
          ,
          <source>Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010)</source>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>W.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rumshisky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Uzuner</surname>
          </string-name>
          ,
          <article-title>Evaluating temporal relations in clinical text: 2012 i2b2 Challenge</article-title>
          ,
          <source>Journal of the American Medical Informatics Association</source>
          <volume>20</volume>
          (
          <year>2013</year>
          )
          <fpage>806</fpage>
          -
          <lpage>813</lpage>
          . doi:10.1136/amiajnl-2013-001628.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>