<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>SEMMES: Semantic Methods for Events and Stories, May</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Data Augmentation for Semantically-Precise Event Relation Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Youssra Rebboud</string-name>
          <email>youssra.rebboud@eurecom.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pasquale Lisena</string-name>
          <email>pasquale.lisena@eurecom.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Raphaël Troncy</string-name>
          <email>raphael.troncy@eurecom.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>EURECOM</institution>
          ,
          <addr-line>Sophia Antipolis</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>2</volume>
      <fpage>8</fpage>
      <lpage>23</lpage>
      <abstract>
        <p>The process of recognizing and classifying the relationships between events mentioned in text is a crucial task in natural language processing (NLP) known as event relation extraction. While temporal relations and causality are largely studied in the literature, other types of relations have received less attention. Our study specifically concentrates on four types of event relations: causality, enabling, prevention, and intention. Our main contribution consists of the use of a state-of-the-art language model (GPT-3) to extend an existing small dataset with synthetic examples, addressing the challenge of insufficient training data. We evaluate the quality of these generated samples by training an event relation extraction system, showing improved performance in classifying event relations.</p>
      </abstract>
      <kwd-group>
        <kwd>Classification</kwd>
        <kwd>Event Relation</kwd>
        <kwd>Information Extraction</kwd>
        <kwd>Knowledge Graphs</kwd>
        <kwd>Machine Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Relation extraction (RE) – the identification and classification of relationships between two
named entities in raw text – is a classic natural language processing (NLP) task which is
receiving attention from the scientific community [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. A more specific branch of RE aims
to automatically detect the relations between events (Event Relation Extraction, ERE). While,
in RE, the entity type is crucial for inferring the correct relation, in ERE the relations involve
homogeneous entities (namely, pairs of events), requiring specialised methods and approaches.
Among event relations, the literature has mostly focused on temporal relations [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], causality [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and
coreference of the same event in different textual resources [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Apart from causal and temporal structures, event relations may include different concepts
such as prevention, intention, enabling, etc. Extracting this variety of relations may serve
various downstream applications, including semantic timelines, question answering, and fact
checking, supporting decision making and improving information and entertainment. Apart
from a demonstrative proof-of-concept [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the automatic extraction of different kinds of relations
has not been deeply investigated. The first step toward achieving this objective is to develop a
dedicated dataset. Although [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] attempts to construct such an initial dataset for multiple event
relations, the result is particularly small in size and exhibits significant imbalances.
      </p>
      <p>
        The recent advent of generative models has marked a paradigm shift in Natural Language
Processing with their capabilities to generate human-like complex text relying only on the
provided prompt. In this work, we aim to use prompt-based solutions to extend The Event
Relations Dataset [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] with synthetic sentences including references to event relations, with a
particular focus on causality, prevention, intention, and enabling. This dataset will then be used
to further evaluate state-of-the-art techniques based on machine learning, to check if they can be
successfully applied in the prediction of event relation types.
      </p>
      <p>This work has two research objectives:
• To investigate whether prompt-based generative models are suitable for generating synthetic
data for the purpose of populating an event relations dataset, particularly for such
rare and very specific event relation types.
• To evaluate the performance of methods based on language models in predicting event
relations when trained on synthetic data.</p>
      <p>The remainder of this paper is structured as follows. First, we review the existing event
relations datasets and event relations extraction in Section 2. We describe our approach for
constructing a synthetic event relations dataset by providing GPT-3 with prompts to generate
sentences that hold prevention, intention, and enabling relations as well as their constructs in
Section 3. Section 4 details a method for extracting event relations from the generated dataset,
whose results are discussed in Section 5. Finally, we conclude and outline some future work in
Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Events and Event Relationships Datasets</title>
        <p>
          Within the field of events and event relationships, several datasets have been developed with
the main objective of capturing events, event coreferences, and causal and temporal relations.
For instance, ACE 2005 (https://catalog.ldc.upenn.edu/LDC2006T06) for event extraction, and TimeBank [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] and CausalTimeBank [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]
for temporal and causal event relationship extraction, respectively.
        </p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], the FARO ontology for representing event relations has been introduced, as a
harmonisation of different models from the literature, including definitions for all relations. In
addition, a first Event Relation dataset covering four event relation types (causality, prevention,
enabling, and intention) is presented in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. With the exception of causality, these relation types
are absent from other annotated datasets. Due to the small size of the dataset and its imbalance,
training a model on top of it presented a significant challenge (more details are provided in Section 3.1).
        </p>
        <p>
          To address this gap, our work aims to increase the size of this dataset using data augmentation
techniques. In prior research, numerous techniques have been elaborated for data augmentation
within the field of events and event relationship extraction [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] such as distant supervision [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]
and translation [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. In this work, we intend to leverage the capabilities of generative models such
as GPT-3 [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. It is worth mentioning that, at the time of writing, no official API for ChatGPT was
available.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Events and Event Relationship Extraction</title>
        <p>
          Numerous techniques were adopted to tackle the event relation extraction problem from a
general point of view regardless of the entity type – event, person, location, etc. –, including
supervised, unsupervised, semi-supervised, and distant supervision approaches [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. Each
approach has advantages and limitations: supervised approaches heavily rely on large training
datasets; on the other hand, unsupervised approaches fall short in labeling the identified clusters
– introducing a barrier for human understanding – and in finding unified evaluation metrics;
distant supervision approaches are based on entity alignment between a corpus and a
knowledge base, but have demonstrated low accuracy due to poor precision.
        </p>
        <p>
          Particularly, the Cross-Modal Attention Network [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] achieved state-of-the-art performance
by simultaneously learning two tasks: entity recognition and relation classification. The
approach involves injecting the token-level information into entity tags, rather than concatenating
token and label representations.
        </p>
        <p>
          In the literature, studies on the extraction of events and their relationships mostly address a subset
of possible relations, namely causality, temporality, and coreference [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Previous studies
demonstrated that models based on the combination of CNNs, LSTMs and attention mechanisms
are able to capture causal dependencies, even when the cause and effect are separated by a
significant distance within the sentence [14].
        </p>
        <p>Event extraction has been made possible using pretrained language models such as BERT
[15], as shown in [16]. SpanBERT [17] – an improved version of BERT that excels in predicting
text spans instead of single words – has also been employed for event extraction, resulting in
notable performance gains [18].</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Building a Synthetic Event Relations Dataset with GPT-3</title>
      <p>The Event Relation dataset (Section 2.1) is the only available dataset including multiple event
relation types. In this section, we describe our efforts for overcoming the two most important
limitations of this dataset: its size and the large imbalance between relation types.</p>
      <p>Our data augmentation strategy for expanding the dataset is based on the automatic
generation of sentences using a prompt-based model. Using the right prompt as input, the model
would provide new synthetic sentences for enriching the dataset.</p>
      <p>
        We use the GPT-3 language model [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], and more precisely the GPT-3.5 text-davinci-003
variant, as described in the OpenAI documentation (https://platform.openai.com/docs/guides/completion).
We are interested in generating sentences that involve events and relationships between them,
particularly those related to prevention, intention, and enabling.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Starting Point: The Event Relations Dataset</title>
        <p>
          The Event Relations Dataset [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] (later referred to in this paper as the Original Dataset) describes some
of the FARO event relation types. It represents the first events and event relations dataset that
encapsulates different event relations, ranging from temporal to causal, and extending beyond
causality to include intention, prevention, enabling, and the explicit negation of causality (which
we will not cover in this work, because it would require a separate discussion). The construction
of the dataset was done by manually re-annotating two existing datasets, TimeBank [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], and
[19], which previously only included temporal and causal relations. The dataset was afterwards
extended with more samples for prevention and enabling relations using the same manual
validation technique.
        </p>
        <p>The annotation of the aforementioned new event relation types also involved the annotation
of the constructs of each relation, which we refer to in the following as event triggers. It is
worth mentioning that these event triggers belong to a bigger class in FARO called Relata, which
is an abstraction encompassing two sub-classes: Event (immanent) and Condition (transcendent).
In this paper, we consider a subset of events that act as triggers for preventing, causing or
intending to cause other events, and a condition that enables the happening of another event.
Example 1. “The move boosts Intelogic Chairman Asher Edelman’s stake to 20% from 16.2%
and may help prevent Martin Ackerman from making a run at the computer-services concern.”
(move −prevents⟶ run)
In Example 1, there exists an event relationship of type prevention, and the two event triggers
that participate in the relation are move and run, both of type Event.</p>
        <p>Example 2. “The government of Prime Minister Brian Mulroney has been under pressure
to reduce the deficit, which is expected to reach C$30 billion this year.”
(pressure −enables⟶ reduce)
In Example 2, the event relationship is of type enabling, with the two corresponding
event triggers being pressure and reduce, in which pressure is an event trigger of type
Condition and reduce is of type Event.</p>
        <p>Table 1 summarizes the number of event relations per relation type in the Original Dataset
after extending it with news agency samples.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Prompt-based Sample Generation of Sentences</title>
        <p>When designing the prompt utilized to generate synthetic examples for a specific relation type,
we include:
1. the definition that the FARO ontology assigns to that relation type;
2. a subset of relevant examples from the dataset.</p>
        <p>We consider a sequence of words Xi = [x1, …, xt1, …, xt2, …, xn], representing an event relationship
of a specific relation type ERx occurring between two Relata. The words xt1 and xt2 respectively
represent in the text the two Relata which are the subject and the object of the relation. The
definition of the relation type, definition(ERx), is taken from the FARO ontology.</p>
        <p>The selection of the prompt was done after a series of attempts. For sentence generation,
we started by leveraging only the task description in the prompt. However, the generated
sentences were too short and basic, while we need realistic and longer sentences, similar to
those in the Original Dataset.</p>
        <p>Table 2 shows an effort to prompt the model to produce sentences that showcase a
connection between events with the desired relation type, but the resulting answer falls short
of meeting our intended expectations.</p>
        <p>The prompt text to generate sample sentences including relations of type ERx is written as
the following:</p>
        <p>Prompt(ERx) = definition(Event) + definition(ERx) + request(ER) + examples(ERx)
This prompt definition concerns the prevention and intention relations. In the context of the enabling
relation, we include the definition of a condition as follows:</p>
        <p>Prompt(ERenable) = definition(Event) + definition(Condition) + definition(ERenable) +
request(ER) + examples(ERenable)
where request(ER) refers to the task description that is given to the language model along with
the definitions and examples(ERx) are randomly-selected examples from the existing dataset
which will be used to iteratively expand and reformulate the dataset.</p>
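        <p>The prompt assembly described above can be sketched in Python as follows; the definition and request strings here are illustrative placeholders, not the exact wording submitted to GPT-3:</p>

```python
def build_prompt(relation, definitions, request, examples):
    """Assemble Prompt(ERx) = definition(Event) [+ definition(Condition)]
    + definition(ERx) + request(ER) + examples(ERx)."""
    parts = [definitions["Event"]]
    if relation == "enable":  # the enabling relation also needs the Condition definition
        parts.append(definitions["Condition"])
    parts.append(definitions[relation])
    parts.append(request)
    parts.extend(examples)
    return "\n".join(parts)

# Placeholder definitions, paraphrasing the FARO-based ones used in the prompts.
definitions = {
    "Event": "An event is a possible or actual event, which can possibly be defined by precise time and space coordinates.",
    "Condition": "A condition is the fact of having certain qualities, which may trigger events.",
    "enable": "The enable relationship connects a condition with an event whose happening it makes possible.",
}
request = "Generate realistic political sentences expressing this relation."
prompt = build_prompt("enable", definitions, request,
                      ["Example sentence 1.", "Example sentence 2."])
```

        <p>At each iteration, the examples slot is filled with randomly-selected sentences from the existing dataset, as described above.</p>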
        <p>Example: Prompt used to generate sentences with event relation of type Enabling</p>
        <p>Note that the Original Dataset was re-annotated based on TimeBank [20] and the Event Causality
dataset [19], both of which are derived from news articles. This makes the majority of the
sentences fall within the political domain. Therefore, we introduce the word political in the
prompt to ensure that the generated sentences are coherent and consistent with the original
dataset domain.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Prompt-based Event Trigger Annotation</title>
        <p>Similarly to sentence generation, we add definitions of events to the prompt, along with a few
examples illustrating the right position of event triggers for each relation type. The prompts
have been chosen to obtain samples with the most similar patterns, in order to facilitate parsing.</p>
        <p>For an event relationship ERx of type prevention or intention, the prompt for selecting
the corresponding event trigger words is designed as follows:</p>
        <p>PromptEventTriggers(ERx) = definition(Event) + definition(ERx) + requesttrig(ERx, sentence, xt1, xt2)
where the last element is the description of the task of retrieving event triggers from the text.
This request takes the following shape:</p>
        <p>If in this sentence &lt;TEXT OF THE SENTENCE&gt; is present an expression with
a &lt;RELATION TYPE&gt; relationship between &lt;TRIGGER1&gt; (trigger1) and &lt;TRIGGER2&gt; (trigger2),
what would be the trigger1 and trigger2 in these sentences? Give me only one single
word for each trigger and only two triggers per sentence. Put each pair between
parentheses in a separate line.</p>
        <p>For event relations of type enable, the definition of the condition is modified in the following
way:</p>
        <p>PromptEventTriggers(ERenable) = definition(Event) + definition(Condition) +
definition(ERenable) + requesttrig(ERenable, sentence, xt1, xt2)
Example: Prompt used to generate event triggers with event relation of type Prevention</p>
        <p>An event is a possible or actual event, which can possibly be
defined by precise time and space coordinates.</p>
        <p>A condition is the fact of having certain qualities, which may
trigger events.</p>
        <p>The prevent relationship connects an event (trigger1) with the
event (trigger 2) for which is the cause of not happening.</p>
        <p>If in this sentence “Subcontractors will be offered a settlement
and a swift transition to new management is expected to avert
an exodus of skilled workers from Waertsilae Marine’s two big
shipyards, government officials said.” is present an expression
with a prevention relationship between settlement (trigger1)
and exodus (trigger2), what would be the trigger1 and trigger2
in these sentences? Give me only one single word for each
trigger and only two triggers per sentence, put each pair between
parentheses in a separate line.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Manual Validation</title>
        <p>Using these methods, we generated 600 sentences for each of the relation types.</p>
        <p>To guarantee the accuracy of the generated set of samples and their appropriate event triggers,
we manually validate each synthetic sentence, ensuring its adherence to the given definition.
Overall, 90.77% of all generated sentences were correctly representing an event relation of the
requested type. After removing the wrong samples from the dataset, we proceed to check the
correctness of the extracted event triggers for the remaining correct sentences.</p>
        <p>The generated event triggers were not consistent in terms of their patterns from one
generation to another. For this reason, an additional parsing step was needed. To do so, we
identified the different textual patterns, processed and categorized them by removing
irrelevant words such as ‘(trigger 1)’, and retaining only the precise word or sequence of words
that represent the essential part of the event. We were able to identify roughly 12 different
patterns. Some examples are reported in Table 3.</p>
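        <p>This parsing step can be sketched as a small regex-based routine; the patterns handled here are two illustrative examples among the roughly 12 variants we observed, not the full set:</p>

```python
import re

def parse_triggers(raw):
    """Extract (trigger1, trigger2) from one generated answer line,
    stripping annotation debris such as '(trigger 1)'."""
    # Drop labels like "(trigger 1)" / "(trigger2)" wherever they appear.
    cleaned = re.sub(r"\(\s*trigger\s*\d\s*\)", "", raw, flags=re.I)
    # Typical remaining shape: "(word1, word2)" on its own line.
    m = re.search(r"\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)", cleaned)
    if m:
        return m.group(1).strip(), m.group(2).strip()
    return None

print(parse_triggers("(settlement (trigger1), exodus (trigger2))"))  # ('settlement', 'exodus')
```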
        <p>After this processing, we validated the correctness of the two trigger words, measuring an
accuracy of 75.15% for trigger-1 and 66.82% for trigger-2. Sentences with wrong triggers were
not eliminated from the dataset, but instead manually fixed. Table 4 shows the detailed accuracy
scores for each relation type.</p>
        <p>We merged the synthetic data with our original dataset, resulting in a larger and more
diverse dataset. We managed to acquire and validate 1507 new sentences – with their relative event
triggers – making a total of 2289 sentences. The statistics of this new dataset – later in the text
named the Augmented Dataset – are reported in Table 5.</p>
        <p>Based on the above reported results, it is evident that GPT-3, given a clear definition of
concepts – particularly the definitions of relations and their construct types – along with a limited
number of examples (5 for each iteration), is able to reach a considerable accuracy. For event
trigger generation, we included just a single example sentence, along with its event triggers.
Observing the first generated annotations and realising that, despite the limited amount of
training data, the results obtained were reasonably good, we decided to continue without adding
further sentences. We consider the accuracy quite high, considering the difficulty of the task.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Events and Event Relation Extraction</title>
      <p>In order to fulfil our goal of testing the effectiveness of the GPT-3-based data augmentation technique
on events and event relationship extraction, and at the same time, to evaluate the performance
of existing models, we conducted the following experiment. We fine-tune two instances of
the BERT model [15]. The first one, named BERTee, is for token classification, which we apply
to event extraction. The second one, named BERTer, is for sequence classification, which
we apply to event relation classification. We additionally fine-tune a variant of BERT for
event extraction, SpanBERT [17], taking into consideration that some of our event triggers are
represented by more than one word, i.e. a span of words, and SpanBERT is specifically designed
to handle this type of representation.</p>
      <p>Xi = [x1, …, xt1, …, xt2, …, xn] is a sentence of n tokens, which is part of the studied dataset.</p>
      <p>BERTee (Figure 1a) is trained for predicting a tag for each token in the input sequence. The
tags are chosen among TAGi = [O, x1type, …, x2type, O], where x1type=[Trigger1] and
x2type=[Trigger2] are the subject and the object of the event relation, and ‘O’ refers to the rest of the tokens
in each sentence. The model consists of 12 transformer blocks receiving in input the sequence
of tokens Xi and returning the relative contextualized representation Hi = [h1, h2, …, hn]. On
top of the transformers, a classification layer – consisting of a fully connected layer followed by
a softmax activation – maps each contextualized representation hi to a probability distribution
over the possible labels for that token, i.e., Phi = [P(Trigger1|hi), P(Trigger2|hi), P(O|hi)]. Finally,
for xi ∈ Xi, we select the most probable tag TAGxi = argmax(Phi).</p>
      <p>Similarly, BERTer (Figure 1b) takes as input the same sequence of tokens Xi, and uses
transformers to compute the contextualized representation Hi. This representation feeds a
softmax classification layer that maps the hidden states to the event relation types, outputting a
probability distribution over the labels Li ∈ L, with L = [causality, enabling, prevention, intention,
No-Relation]. We similarly select the most probable label from the output probabilities.</p>
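      <p>The decision step shared by both heads reduces to a softmax over label scores followed by an argmax; a minimal sketch with made-up logits (in the real system these scores come from the fine-tuned classification layer):</p>

```python
import math

LABELS = ["causality", "enabling", "prevention", "intention", "No-Relation"]

def softmax(logits):
    """Map raw scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def predict_label(logits, labels=LABELS):
    """Pick the most probable label, i.e. the argmax over P(L|Hi)."""
    probs = softmax(logits)
    return labels[probs.index(max(probs))]

# Made-up logits for one sentence representation Hi:
print(predict_label([2.1, 0.3, -1.0, 0.8, 0.1]))  # highest logit wins -> "causality"
```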
      <p>Some of the event mentions present in our work are denoted as a span of words, i.e. a sequence
of words. Although they represent a minority, we wanted to test a variant of BERT called
SpanBERT [17], which is trained mainly to predict a sequence of words rather than a single
word. During training, the model masks spans of words rather than random single words,
pushing the neural network to predict them. The model is similar to the overall architecture
of BERTee in terms of transformer blocks: a base model with 12 transformer blocks. In
addition to the classification layer, SpanBERT also includes a span classification layer that is
designed to predict the label for a span of tokens. For example, given the sentence: “The United
Nations passed a resolution to impose an arms embargo on Syria in an effort to pressure
the government to end its civil war”, the span “arms embargo” is expected to be tagged by
SpanBERT more robustly as a single class label, namely as “Trigger1 Trigger1”.</p>
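      <p>Aligning a multi-word trigger such as “arms embargo” with per-token tags can be sketched as follows; whitespace tokenisation is a simplification of the WordPiece tokenisation actually used by BERT:</p>

```python
def tag_tokens(tokens, trigger1, trigger2):
    """Assign 'Trigger1'/'Trigger2'/'O' to each token; a multi-word
    trigger receives the same tag on every token of its span."""
    tags = ["O"] * len(tokens)
    for label, span in (("Trigger1", trigger1), ("Trigger2", trigger2)):
        words = span.split()
        for i in range(len(tokens) - len(words) + 1):
            if tokens[i:i + len(words)] == words:
                for j in range(i, i + len(words)):
                    tags[j] = label
                break
    return tags

sentence = "The UN passed a resolution to impose an arms embargo on Syria".split()
print(tag_tokens(sentence, "arms embargo", "resolution"))
```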
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <sec id="sec-5-1">
        <title>5.1. Events and Event Relationship Extraction</title>
        <p>We trained the BERTee, BERTer, and SpanBERT models on both the Original and Augmented
datasets. However, it is important to note that the test set used for evaluation was extracted
solely from the Original Dataset. In other words, synthetic data are used only in the training
phase, so that they do not distort the evaluation of the developed systems.</p>
        <p>The outcomes of the experiment with BERTer are shown in Table 6. We observe that the
performance varies across different classes in the Original Dataset. The model shows relatively
higher F1 scores for the ‘cause’ and ‘enable’ classes, while it struggles to perform well on the
‘intend’ and ‘prevent’ classes. This is probably due to the low support of these latter classes,
with the intention relation being the least represented in the dataset, with only 44 example sentences.</p>
        <p>Data augmentation leads to significant improvements in the metrics, with an increase of
the F1-score for all classes. The highest improvement is observed in the ‘prevent’ and ‘enable’
classes, and in particular for the ‘intend’ class, whose F1-score increases by 115.91%
with respect to the 44% obtained on the Original Dataset. Despite the absence of added synthetic data for
the ‘cause’ relation, we still notice an improvement by a modest margin. Future work will
involve data augmentation also for that class.</p>
        <p>The performance of BERTee and SpanBERT in event classification is lower on both the
Original and Augmented datasets (Table 7). Despite testing different parameters, the results
indicate that further improvements are still needed to enhance the performance on this task.
Therefore, the current outcomes should be viewed as preliminary, and further investigations
will be carried out as future work.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Modeling the Extracted Event Relations in a Knowledge Graph</title>
        <p>Event Knowledge Graphs have been shown to be an effective data representation for easing navigation
through event flows and their relations, and for flexibly retrieving information about these events
from the stored knowledge [21]. They can serve many applications such as link prediction and
fact-checking, and their usefulness tends to grow as they become richer in terms of
aspects and relationships. To this end, we aim to generate a Knowledge Graph
(KG) of events and relations between them from the Augmented Dataset. In other words, this
KG will be an RDF version of the Augmented Dataset.</p>
        <p>The KG that we constructed contains events and relations between them. The elements in
our KG are classified according to the FARO ontology, which distinguishes between two major
types of Relata: Condition and Event (see Section 3.1).</p>
        <p>More precisely, the events are typed according to the relation between them. The ‘enables’
relation is used to connect two entities in which the subject represents the Condition that
is necessary for the object (Event) to occur. We also represented in our KG the other relations on which we
focused in the previous sections, such as “causes”, “prevents”, and “intends”,
which relate two entities of type Event. These relations capture different types of causal and
temporal dependencies between events, and they allow reasoning about complex chains of
events and their potential consequences.</p>
        <p>To ensure the traceability of our KG, we linked each event in our KG to the sentence it was
extracted from, and each sentence to its provenance corpus, using the Provenance ontology
(PROV-O) [22]. This provenance information allows tracking the origin of each piece of
information in our KG, and verifying its accuracy and relevance. When the provenance involves
scientific datasets or software, it is referenced in the graph using the FaBiO ontology [23],
detailing information about the paper title, author and year of publication.</p>
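        <p>The event-to-sentence-to-corpus provenance chain can be sketched in N-Triples using PROV-O's prov:wasDerivedFrom property; the base namespace and identifiers below are made up for illustration and are not the ones used in the kFLOW graph:</p>

```python
BASE = "http://example.org/kg/"        # hypothetical namespace for the example
PROV = "http://www.w3.org/ns/prov#"

def derived_from(subject, source):
    """Emit one N-Triples statement linking a resource to its origin."""
    return f"<{BASE}{subject}> <{PROV}wasDerivedFrom> <{BASE}{source}> ."

triples = [
    derived_from("event/42", "sentence/7"),         # event extracted from a sentence
    derived_from("sentence/7", "corpus/timebank"),  # sentence taken from a corpus
]
print("\n".join(triples))
```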
        <p>
          Additionally, in order to have a more complete KG in terms of event relations (temporal
relations, comparative relations, etc.), we have incorporated, in the same aforementioned way,
events and their temporal relations from the TimeBank [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] corpus. We also leverage events which
have temporal, comparative and contingent relations from the [24] dataset, which we call in the rest
of the paper the Hong dataset.
        </p>
        <p>TimeBank consists of 24k events with 3.4k temporal links, extracted from 183 news articles.
On the other hand, the Hong dataset consists of the annotation of ACE 2005 news-wire documents and
other news documents about Malaysia Airlines Flight 17, resulting in 862 events with 25,610 relations
between them.</p>
        <p>The integration of the aforementioned datasets was made after examining the overlap between
some of their event relation definitions and the FARO ontology. The mapping of these relations to
FARO is shown in Tables 8 and 9.</p>
        <p>The resulting graph – containing over 68,000 statements, with 11,917 event relation links –
has been loaded in a triplestore, available for query at http://kflow.eurecom.fr/.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion and Future Work</title>
      <p>In this work, we made a first attempt towards the automatic extraction of event relations with
precise semantics from raw text, focusing on a subset of them. We applied GPT-3 to generate
synthetic data for the aforementioned event relation types, obtaining good accuracy. The result
of this effort is a dataset consisting of 2289 sentences – of which 1507 are synthetic – annotated
with the event mentions in the text. The data augmentation method described in this paper
can be used to extract even more event relations by properly replacing the definition and the
examples to match the required relation types.</p>
      <p>Furthermore, we used BERT to perform two related tasks: event relation classification
and event mention classification. We also utilized SpanBERT to evaluate its ability to classify
events that are expressed as a sequence of words into a single class, even though such events are
less frequent. The reported results show that the augmented dataset – and synthetic data
in general – improves the ability of the model to generalize and correctly classify sequences, even for
classes with a limited number of training examples in the original dataset. However, for event
mention classification the performance is still relatively low and needs further improvement.</p>
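<p>For event mention classification, token-level labels are typically derived from annotated spans before fine-tuning a BERT-style model; a minimal span-to-BIO conversion sketch (label names and indices are illustrative, not the paper's actual annotation scheme):</p>

```python
# Minimal sketch: convert annotated event mention spans into BIO tags,
# the usual label format for token classification with BERT-style models.

def spans_to_bio(num_tokens, spans):
    """spans: list of (start, end, label) with end exclusive."""
    tags = ["O"] * num_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"       # first token of the mention
        for i in range(start + 1, end):  # remaining tokens of the mention
            tags[i] = f"I-{label}"
    return tags
```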
      <p>All code used for the experiments reported in this paper, as well as the resulting dataset, is
available at https://github.com/ANR-kFLOW/event-relation-classification.</p>
      <p>With the recent release of GPT-4 [25], new experiments can be performed to improve
synthetic data generation, in particular for the extraction of relevant triggers.</p>
      <p>
        Furthermore, we would like to investigate the interaction between event relation types
and event mentions by jointly extracting them from text, with the aim of enhancing event
trigger classification by leveraging event relation information. In particular, we plan to test
the effectiveness of the [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] model on our own dataset and assess its performance under similar
conditions.
      </p>
      <p>In future work, we intend to combine the automatic classification of event relations with classic
event identification techniques, in order to automatically annotate news and encyclopedic
entries, with the final goal of realizing a KG of interconnected events with precise semantics.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work has been partially supported by the French National Research Agency (ANR) within
the kFLOW project (Grant n°ANR-21-CE23-0028).</p>
      <p>relation extraction, in: Twenty-Ninth International Joint Conference on Artificial Intelligence, 2021, pp. 4032–4038.
[14] J. Yang, S. C. Han, J. Poon, A survey on extraction of causal relations from natural language text, Knowledge and Information Systems 64 (2021) 1161–1186.
[15] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, in: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 4171–4186. URL: https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.
[16] S. Yang, D. Feng, L. Qiao, Z. Kan, D. Li, Exploring pre-trained language models for event extraction and generation, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Florence, Italy, 2019, pp. 5284–5294. URL: https://aclanthology.org/P19-1522. doi:10.18653/v1/P19-1522.
[17] M. Joshi, D. Chen, Y. Liu, D. S. Weld, L. Zettlemoyer, O. Levy, SpanBERT: Improving pre-training by representing and predicting spans, Transactions of the Association for Computational Linguistics 8 (2020) 64–77. URL: https://aclanthology.org/2020.tacl-1.5. doi:10.1162/tacl_a_00300.
[18] B. Portelli, D. Passabi, E. Lenzi, G. Serra, E. Santus, E. Chersoni, Improving adverse drug event extraction with SpanBERT on different text typologies, arXiv abs/2105.08882 (2021).
[19] Q. Ning, Z. Feng, H. Wu, D. Roth, Joint Reasoning for Temporal and Causal Relations, in: 56th Annual Meeting of the Association for Computational Linguistics, volume 1, Association for Computational Linguistics, Melbourne, Australia, 2018.
[20] N. UzZaman, H. Llorens, L. Derczynski, J. Allen, M. Verhagen, J. Pustejovsky, SemEval-2013 Task 1: TempEval-3: Evaluating Time Expressions, Events, and Temporal Relations, in: 7th International Workshop on Semantic Evaluation (SemEval), Association for Computational Linguistics, Atlanta, USA, 2013, pp. 1–9.
[21] S. Guan, X. Cheng, L. Bai, F. Zhang, Z. Li, Y. Zeng, X. Jin, J. Guo, What is event knowledge graph: A survey, IEEE Transactions on Knowledge &amp; Data Engineering (2022) 1–20. doi:10.1109/TKDE.2022.3180362.
[22] K. Belhajjame, J. Cheney, D. Corsar, D. Garijo, S. Soiland-Reyes, S. Zednik, J. Zhao, PROV-O: The PROV Ontology, Technical Report, W3C, 2012. URL: http://www.w3.org/TR/prov-o/.
[23] S. Peroni, D. Shotton, FaBiO and CiTO: Ontologies for describing bibliographic resources and citations, Journal of Web Semantics 17 (2012) 33–43. doi:10.1016/j.websem.2012.08.001.
[24] Y. Hong, T. Zhang, T. O'Gorman, S. Horowit-Hendler, H. Ji, M. Palmer, Building a Cross-document Event-Event Relation Corpus, in: 10th Linguistic Annotation Workshop (LAW-X 2016), Association for Computational Linguistics, Berlin, Germany, 2016, pp. 1–6. doi:10.18653/v1/W16-1701.
[25] OpenAI, GPT-4 Technical Report, 2023. URL: https://arxiv.org/abs/2303.08774. doi:10.48550/arXiv.2303.08774.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>I.</given-names>
            <surname>Hendrickx</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. N.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Kozareva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. Ó</given-names>
            <surname>Séaghdha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Padó</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pennacchiotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Romano</surname>
          </string-name>
          , S. Szpakowicz,
          <article-title>SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations between Pairs of Nominals</article-title>
          ,
          <source>in: Proceedings of the 5th International Workshop on Semantic Evaluation</source>
          , Association for Computational Linguistics, Uppsala, Sweden,
          <year>2010</year>
          , pp.
          <fpage>33</fpage>
          -
          <lpage>38</lpage>
          . URL: https://aclanthology.org/S10-1006.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Nasar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. W.</given-names>
            <surname>Jafry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Malik</surname>
          </string-name>
          ,
          <article-title>Named Entity Recognition and Relation Extraction: State-of-the-Art</article-title>
          ,
          <source>ACM Comput. Surv.</source>
          <volume>54</volume>
          (
          <year>2021</year>
          ). URL: https://doi.org/10.1145/3445965. doi:10.1145/3445965.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.-T.</given-names>
            <surname>Vo</surname>
          </string-name>
          , E. Bagheri,
          <article-title>Extracting Temporal Event Relations Based on Event Networks</article-title>
          , in:
          <string-name><given-names>L.</given-names> <surname>Azzopardi</surname></string-name>
          ,
          <string-name><given-names>B.</given-names> <surname>Stein</surname></string-name>
          ,
          <string-name><given-names>N.</given-names> <surname>Fuhr</surname></string-name>
          ,
          <string-name><given-names>P.</given-names> <surname>Mayr</surname></string-name>
          ,
          <string-name><given-names>C.</given-names> <surname>Hauff</surname></string-name>
          , D. Hiemstra (Eds.),
          <source>Advances in Information Retrieval</source>
          , Springer International Publishing, Cham,
          <year>2019</year>
          , pp.
          <fpage>844</fpage>
          -
          <lpage>851</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Khetan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ramnani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Anand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sengupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Fano</surname>
          </string-name>
          ,
          <article-title>Causal bert: Language models for causality detection between events expressed in text</article-title>
          , in: K. Arai (Ed.),
          <source>Intelligent Computing</source>
          , Springer International Publishing, Cham,
          <year>2022</year>
          , pp.
          <fpage>965</fpage>
          -
          <lpage>980</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zuo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <article-title>Extracting events and their relations from texts: A survey on recent research progress and challenges</article-title>
          ,
          <source>AI Open</source>
          <volume>1</volume>
          (
          <year>2020</year>
          )
          <fpage>22</fpage>
          -
          <lpage>39</lpage>
          . doi:10.1016/j.aiopen.2021.02.004.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Rebboud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lisena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Troncy</surname>
          </string-name>
          ,
          <article-title>Beyond Causality: Representing Event Relations in Knowledge Graphs</article-title>
          , in: Knowledge Engineering and Knowledge Management (EKAW), Springer International Publishing, Bolzano, Italy,
          <year>2022</year>
          , pp.
          <fpage>121</fpage>
          -
          <lpage>135</lpage>
          . doi:10.1007/978-3-031-17105-5_9.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N.</given-names>
            <surname>UzZaman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Llorens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Derczynski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Verhagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pustejovsky</surname>
          </string-name>
          ,
          <article-title>SemEval-2013 Task 1: TempEval-3: Evaluating time expressions, events, and temporal relations</article-title>
          ,
          <source>in: Second Joint Conference on Lexical and Computational Semantics (*SEM)</source>
          , Volume
          <volume>2</volume>
          :
          <source>Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)</source>
          , Association for Computational Linguistics, Atlanta, Georgia, USA,
          <year>2013</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          . URL: https://aclanthology.org/S13-2001.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Mirza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sprugnoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tonelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Speranza</surname>
          </string-name>
          ,
          <article-title>Annotating causality in the TempEval-3 corpus</article-title>
          , in:
          <source>Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)</source>
          , Association for Computational Linguistics, Gothenburg, Sweden,
          <year>2014</year>
          , pp.
          <fpage>10</fpage>
          -
          <lpage>19</lpage>
          . URL: https://aclanthology.org/W14-0702. doi:10.3115/v1/W14-0702.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>X.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Qin</surname>
          </string-name>
          , T. Liu, Y. Liu,
          <article-title>Effective deep memory networks for distant supervised relation extraction</article-title>
          ,
          <source>in: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>4002</fpage>
          -
          <lpage>4008</lpage>
          . doi:10.24963/ijcai.2017/559.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          , Event Detection via Gated Multilingual Attention Mechanism,
          <source>in: AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>32</volume>
          ,
          <year>2018</year>
          , pp.
          <fpage>4865</fpage>
          -
          <lpage>4872</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ryder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Subbiah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhariwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Neelakantan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shyam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sastry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Askell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Herbert-Voss</surname>
          </string-name>
          , G. Krueger,
          <string-name>
            <given-names>T.</given-names>
            <surname>Henighan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Child</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ziegler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Winter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hesse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          , E. Sigler,
          <string-name>
            <given-names>M.</given-names>
            <surname>Litwin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chess</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Berner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>McCandlish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          ,
          <article-title>Language Models are Few-Shot Learners</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>33</volume>
          ,
          , Curran Associates, Inc.,
          <year>2020</year>
          , pp.
          <fpage>1877</fpage>
          -
          <lpage>1901</lpage>
          . URL: https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. Y.</given-names>
            <surname>Zakari</surname>
          </string-name>
          , G. Lu,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <article-title>Deep Neural Network Based Relation Extraction: An Overview</article-title>
          ,
          <source>Neural Computing and Applications</source>
          <volume>34</volume>
          (
          <year>2022</year>
          )
          <fpage>4781</fpage>
          -
          <lpage>4801</lpage>
          . doi:10.1007/s00521-021-06667-3.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Modeling dense cross-modal interactions for joint entity-</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>