<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exploration of Event Extraction Techniques in Late Medieval and Early Modern Administrative Records</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ismail Prada Ziegler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Digital Humanities, University of Bern / Department of History, University of Basel</institution>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <fpage>761</fpage>
      <lpage>771</lpage>
      <abstract>
        <p>While an increasing number of studies exploring named entity recognition in historical corpora is being published, the application of other information extraction tasks such as event extraction remains scarce. This study explores two accessible methods to facilitate the detection of events and the classification of entities into roles: rule-based systems and RNN-based machine learning techniques. We focus on a German-language corpus from the 15th-17th c. and on property purchases as the event type. We show that these relatively simple methods can retrieve useful information and discuss ideas to further enhance the results.</p>
      </abstract>
      <kwd-group>
        <kwd>information extraction</kwd>
        <kwd>historical data</kwd>
        <kwd>digital history</kwd>
        <kwd>machine learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Among historical documents from the late medieval and early modern periods, administrative
records are one of the most prevalent types of source material. These documents exhibit a
high density of information and often display some degree of standardisation within
collections. These traits make them ideal candidates for digital methods of information extraction
and analysis.</p>
      <p>However, applying digital information extraction techniques to historical documents
presents numerous challenges. Annotated historical datasets are limited both in size and in
number, and variations in grammar and spelling due to the lack of standardisation pose
significant obstacles. Despite these difficulties, notable advancements have been made in the field
due to growing interest in digital history and digital humanities. An overview of recent studies
concerning named entity recognition can be found in [<xref ref-type="bibr" rid="ref3">3</xref>].</p>
      <p>
        This paper contributes to this evolving field by presenting a case study on extracting event
information from historical land registers. In our project Economies of Space we work to digitize
these registers and explore the potential of extracting information such as entities, relations
and events.1 Our goal is to create a knowledge base where the individual histories of persons,
properties, and organizations can be explored, as well as to enable distant reading methods
of analysis. In [
        <xref ref-type="bibr" rid="ref8">6</xref>
        ] we demonstrated that robust named entity recognition is possible for our
data. In this study, we explore the potential of event extraction as a first step to investigate
interactions between the found entities. We compare two methods: rule-based extraction and
RNN-based machine learning. While this case study focuses on a narrow example, we hope
that the findings of these experiments will benefit other teams working with similar datasets.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Dataset</title>
      <sec id="sec-2-1">
        <title>2.1. The Historical Land Registers</title>
        <p>The experiments were conducted with the Historical Land Registers of Basel2. This archival
collection aimed to bring together excerpts from all archival documents which mention a
property inside the old city of Basel. The content is a mix of legal and bookkeeping information
relating to property ownership, rents, and transactions. Our project focuses on 80,000 excerpts
from between 1400 and 1700, written in Early New High German3. Almost all excerpts are kept
to a single sentence, even when describing complex events. For the remainder of this paper,
the term ”sample” will refer to an individual document within this collection. The documents
were automatically transcribed with an average CER of 3.6%.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Entity Annotation</title>
        <p>640 samples were annotated following the BeNASch guidelines4. BeNASch applies a nested
entity representation, which means that for each entity mention, a mention span (e.g., ’the house at
the river’, ’Hans Stuber, the tailor’) is annotated as well as a head element (e.g., ’house’, ’Hans
Stuber’). All entity mentions that fall into one of the categories PER (persons), ORG
(organizations), LOC (locations), or GPE (geo-political entities), including pronouns, are annotated.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Event Annotation</title>
        <p>The 640 samples also feature event annotation. We define an event as a ”specific occurrence
involving participants” following the ACE guidelines5. Only events that belong to categories
which were determined in our project to be of interest to historical research are annotated. An
event is characterized by two main elements: the trigger and the roles. The trigger represents
a word or phrase around which the event is centered. Roles match entity annotations and
describe an entity's part in the event. See Appendix A for an annotation example.
2https://dls.staatsarchiv.bs.ch/records/1016781
3Although as is always the case with copied documents, we must suspect that at least in some cases modifications
and to some degree modernization of text took place. To answer the question ”to what degree?” is part of our
research project.
4https://dhbern.github.io/BeNASch/
5https://www.ldc.upenn.edu/sites/www.ldc.upenn.edu/files/english-events-guidelines-v5.4.3.pdf. Similar
guidelines have since been adopted in BeNASch.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Data and Evaluation</title>
        <p>For the purposes of this study, we focus on the event type property purchase. It appears in
167 out of the 640 samples, making it comparatively frequent. We define the following roles
for the event property purchase: seller (PER or ORG), buyer (PER or ORG), property (LOC),
price (MONEY). Every role may appear multiple times, and only the property role must appear
at least once. The total occurrences of each role are shown in Table 1. We implement 5-fold
stratified cross-validation because our dataset is still extremely small, especially for
machine-learning purposes. We split each fold 60/20/20% for training, validation and testing respectively.
The results represent the average across the five folds. This dataset still contains all other
event-annotated samples, but triggers and roles in those have been removed (we do this to evaluate
whether our systems can distinguish property purchase events from other events).</p>
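A minimal sketch of this fold construction, assuming 5 buckets are reused as 3/1/1 train/validation/test splits to obtain the 60/20/20% ratio; the round-robin bucketing and all function names are our illustrative assumptions, not the project's actual implementation:

```python
import random

def stratified_folds(labels, n_folds=5, seed=42):
    """Assign each sample index to one of n_folds buckets, stratified by label."""
    rng = random.Random(seed)
    buckets = [[] for _ in range(n_folds)]
    by_label = {}
    for idx, lab in enumerate(labels):
        by_label.setdefault(lab, []).append(idx)
    for members in by_label.values():
        rng.shuffle(members)
        # round-robin so each bucket gets a near-equal share of each label
        for pos, idx in enumerate(members):
            buckets[pos % n_folds].append(idx)
    return buckets

def splits(buckets):
    """For each fold: one bucket for testing, one for validation, three for training."""
    n = len(buckets)
    for i in range(n):
        test = buckets[i]
        val = buckets[(i + 1) % n]
        train = [idx for j in range(n) if j not in (i, (i + 1) % n) for idx in buckets[j]]
        yield train, val, test
```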
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Rule-based System</title>
        <sec id="sec-3-2-1">
          <title>3.2.1. Trigger Detection</title>
          <p>For each fold, we create a gazetteer of potential trigger phrases by counting the
trigger-annotated phrases in that fold's training set. We exclude phrases which appear fewer than
k times from the gazetteer. We then compare this gazetteer to the input samples and apply
fuzzy matching, using the thefuzz Python library6, to mark one or multiple tokens as triggers.
We allow our algorithm to detect multiple trigger phrases in a single sample. For each fold,
we determine the minimum ratio for the fuzzy matching as well as the minimum frequency k
by running different parameter combinations against the validation set and choosing the best
result. The best parameters were either setting k to 3 and the minimum fuzz ratio to 0.8 or setting
both parameters to 1.</p>
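The gazetteer construction and fuzzy matching can be sketched as follows. We use the standard library's difflib.SequenceMatcher as a stand-in for the thefuzz ratio, and both function names and the token-level matching are illustrative assumptions:

```python
from collections import Counter
from difflib import SequenceMatcher

def build_gazetteer(trigger_phrases, k):
    """Keep only trigger phrases seen at least k times in the training fold."""
    counts = Counter(p.lower() for p in trigger_phrases)
    return sorted(p for p, c in counts.items() if c >= k)

def detect_triggers(tokens, gazetteer, min_ratio=0.8):
    """Return token indices whose fuzzy similarity to a gazetteer phrase reaches min_ratio."""
    hits = []
    for i, tok in enumerate(tokens):
        for phrase in gazetteer:
            if SequenceMatcher(None, tok.lower(), phrase).ratio() >= min_ratio:
                hits.append(i)
                break
    return hits
```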
          <p>We avoid some frequent problems with additional rules: 1. To prevent misidentification of
the word ”Kauf” in documents titled ”Kauf-Urkunde” (purchase deed), we forbid the first token
in a document to match a trigger. 2. Triggers may only match words outside of entity
mentions; this prevents, for example, ”verkauft” in ”das verkauft Haus” (the sold house) from being
identified as a trigger. 3. If one trigger follows another trigger without any entity mention in
between, we remove the second one (e.g. ”es verkauft und gibt zu kaufen”). While these kinds
of errors don't have a negative impact on the document classification or role detection, they
distort the trigger detection scores to look worse than they actually are. 4. In some
cases two predicted triggers are separated by a person or organization mention. We reclassify
the second trigger as a helper in that case. Helper triggers provide useful information for the role
detection, and their use is explained in the next section. 5. Finally, we remove triggers where a
MONEY or TIME annotation is found between the trigger and a LOC. These cases indicate rent
purchase documents, which are very similar in language and structure to property purchase
documents.</p>
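Rules 1-3 can be sketched as a post-filter over proposed trigger positions. This is a simplified illustration under our own span representation (token-index ranges with exclusive ends), not the project's code:

```python
def filter_triggers(trigger_idx, entity_spans):
    """Apply rules 1-3 to proposed trigger token indices.
    entity_spans: list of (start, end_exclusive) token spans of entity mentions."""
    inside = set()
    for start, end in entity_spans:
        inside.update(range(start, end))
    kept = []
    for i in sorted(trigger_idx):
        if i == 0:                        # rule 1: the first token never matches a trigger
            continue
        if i in inside:                   # rule 2: triggers only outside entity mentions
            continue
        if kept:
            between = set(range(kept[-1] + 1, i))
            if not between.intersection(inside):
                continue                  # rule 3: no entity mention in between, drop the second
        kept.append(i)
    return kept
```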
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Role Detection</title>
          <p>To detect roles, we apply a simple template system whenever a trigger is present. For
property purchase documents, we identify three different kinds of structures (ignoring
non-entity mentions and non-values).
Templates 1 and 2 are usually found when the sale is the central event of the excerpt, while
template 3 usually follows a seizure event, giving information about who bought the property after
it was seized and auctioned off. Sometimes roles are missing from the text, so we only require
a trigger and at least one LOC mention to apply a template. The template used is decided by
looking at the differences: a LOC before the trigger implies template 3; otherwise we check whether a
PER/ORG is present before the trigger: if yes, template 1 is used, otherwise template
2. We can match mentions to roles due to the restrictions in their categories, as long as their
position relative to the other roles and the trigger is correct: PER and ORG can only be SELLER
and BUYER, while LOC can only be PROPERTY and MONEY can only be PRICE. One challenge
is the distinction between SELLER and BUYER in template 2. To show what can already be
done by simple means in this case study, we solve this by assigning the first half of all PER/ORG
mentions as SELLER and the second half as BUYER (in case of an
odd number of candidates, SELLER gets the additional one). If a helper trigger is present, we
use it to distinguish buyer and seller.</p>
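The template decision and role assignment can be sketched as follows. The data representation is an assumption, and how template 3 distributes PER/ORG mentions beyond the buyer is our guess (we treat all PER/ORG after the trigger as buyers there); the helper-trigger refinement is omitted:

```python
def assign_roles(mentions, trigger_pos):
    """Template-based role assignment. mentions: list of (token_position, category)
    pairs, with categories PER, ORG, LOC, MONEY. Returns a role dict or None."""
    locs = [m for m in mentions if m[1] == "LOC"]
    if not locs:
        return None                                   # at least one property (LOC) required
    persons = sorted(m for m in mentions if m[1] in ("PER", "ORG"))
    roles = {"property": [m[0] for m in locs],
             "price": [m[0] for m in mentions if m[1] == "MONEY"]}
    loc_before = any(trigger_pos > m[0] for m in locs)
    pers_before = [m[0] for m in persons if trigger_pos > m[0]]
    pers_after = [m[0] for m in persons if m[0] > trigger_pos]
    if loc_before:                                    # template 3: buyers follow the trigger
        roles["seller"], roles["buyer"] = [], pers_after
    elif pers_before:                                 # template 1: sellers before, buyers after
        roles["seller"], roles["buyer"] = pers_before, pers_after
    else:                                             # template 2: first half SELLER, second half BUYER
        half = (len(pers_after) + 1) // 2             # odd count: extra mention goes to SELLER
        roles["seller"], roles["buyer"] = pers_after[:half], pers_after[half:]
    return roles
```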
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Machine-Learning System</title>
        <sec id="sec-3-3-1">
          <title>3.3.1. Architecture</title>
          <p>
            Our approach to event extraction by machine learning is inspired by previous successes in
extracting entities in pre-modern German texts [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ][
            <xref ref-type="bibr" rid="ref8">6</xref>
            ]. Like entities, we can model roles and triggers
as annotation spans in the text and apply a sequence tagging strategy (this is one common way
to model event extraction [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ]).
          </p>
          <p>
            We implement our experiment using the FlairNLP framework [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ]. For the language model,
we stack a forward and a backward model of contextual string embeddings [2], which we
obtained by fine-tuning the de-model on all handwritten documents in the Historical Land
Registers, including those later than 1700 (appr. 9.14M tokens). Character-based embeddings have
demonstrated robustness against the inherent variability of pre-modern German spelling and
vocabulary [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ]. For the event extraction, we train a sequence tagging model with the default
settings of Flair (single-layered Bi-LSTM + CRF decoder).
          </p>
        </sec>
        <sec id="sec-3-3-2">
          <title>3.3.2. Pretagging</title>
          <p>To insert the information from the named entity annotation into the model, we add a prefix
and suffix token to each entity mention. ”Hans sold his house .” becomes ”[B-PER] Hans
[E-PER] sold [B-LOC] his house [E-LOC] .” For experiments focused only on role detection, we
incorporate trigger information in the same manner (”[B-SALE] sold [E-SALE]”).</p>
          <p>We conduct experiments with manually annotated tags as well as automatically predicted
ones. The predicted annotations come from a Flair SequenceTagger as well, trained with the same
language model as the event recognition. A separate model is trained for each fold so that no data
contamination occurs.</p>
          <p>When pretagging is applied, the role detection is not required to match the whole span of
the pretagged entity; instead it is trained to classify the prefix token (e.g. ”[B-PER]”) correctly.
The training data is adjusted accordingly (see Appendix B for an example).</p>
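The pretagging step can be sketched as below; token-index spans with exclusive ends are an assumed representation, and the function name is illustrative:

```python
def pretag(tokens, entities):
    """Insert [B-TYPE] / [E-TYPE] marker tokens around each entity span.
    entities: list of (start, end_exclusive, type) over token indices."""
    opens, closes = {}, {}
    for start, end, etype in entities:
        opens.setdefault(start, []).append("[B-%s]" % etype)
        closes.setdefault(end, []).append("[E-%s]" % etype)
    out = []
    for i, tok in enumerate(tokens):
        out.extend(opens.get(i, []))
        out.append(tok)
        out.extend(closes.get(i + 1, []))
    return out
```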
        </sec>
        <sec id="sec-3-3-3">
          <title>3.3.3. Variants</title>
          <p>
            Shortening: To shorten our samples and possibly remove noise, we remove all tokens inside
entity annotations which are not part of the head. This reduces our sample length by an average
of about a third of all tokens. The NER models for this variant of pretagging were trained as
described in [
            <xref ref-type="bibr" rid="ref8">6</xref>
            ].
          </p>
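A sketch of the shortening step, assuming entity spans and their head sub-spans are given as token-index ranges with exclusive ends (our assumed representation):

```python
def shorten(tokens, entities):
    """Drop tokens inside entity spans that are not part of the head sub-span.
    entities: list of (start, end_exclusive, head_start, head_end_exclusive)."""
    drop = set()
    for start, end, h_start, h_end in entities:
        for i in range(start, end):
            if i in range(h_start, h_end):
                continue                  # keep head tokens
            drop.add(i)
    return [tok for i, tok in enumerate(tokens) if i not in drop]
```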
          <p>Document Filtering: Because in initial tests the system often annotated roles in documents
even when it (correctly) identified no trigger, we added a rule to disregard all role
annotations in documents where no trigger is present.</p>
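The document filter reduces to a simple post-processing rule; the prediction format below is an assumed illustration:

```python
def doc_filter(predictions):
    """Discard all role predictions for documents in which no trigger was predicted.
    predictions: maps doc_id to a list of (span, label) pairs."""
    filtered = {}
    for doc_id, spans in predictions.items():
        has_trigger = any(label == "Trigger" for _, label in spans)
        filtered[doc_id] = spans if has_trigger else []
    return filtered
```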
          <p>Note that these variants only apply to the machine learning strategy. Shortening is
irrelevant to the rule-based strategy because trigger detection only happens outside of entity
annotations, and role detection is based only on the entity annotations' positions and classes, not
their content. Document Filtering doesn't apply because only documents that contain triggers are
processed further.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results &amp; Discussion</title>
      <sec id="sec-4-1">
        <title>4.1. Experimental setup</title>
        <p>We experiment with four base settings for the machine learning method:
• Pretagged: training and test sets contain named entities retrieved from our ground truth
dataset.
• PredNEsTest: training set with pretags from the ground truth, but a test set with
automatically predicted NE annotations. This represents the practical scenario for our
project, but is highly dependent on the quality of the NER model.
• PredNEs: automatically predicted entities in both training and test set. We test this setup
to see if training on noisy entity mentions improves the model's robustness during testing
when confronted with similar noise.
• Plain: no pretagging.</p>
        <p>Additionally, we test variants adding the shortening augmentation (+Shortening) and the document
filter (+DocFilter). For the rule-based system, we report two setups, one using the ground
truth entity mentions and one using automatically predicted entity mentions (analogous to
PredNEsTest).</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Rule-based vs. Machine Learning</title>
        <p>Table 2 shows that the machine learning systems significantly outperform our rule-based
systems, regardless of whether tags are generated from ground truth information or automatically
predicted. Interestingly, the trained model without any pretagging (Plain) still performs similarly
in role detection to a rule-based system working with pretagging information.</p>
        <p>In Table 3 the results of the respective best models are shown per category. We
observe that the machine learning system is slanted heavily towards high precision values. The
document filter rule contributes to this, reducing recall by appr. one percentage point, but even
without the filter, a significant slant towards precision remains. Depending on the use of the
annotations, this may be problematic. Especially when the annotations are used as a tool to find
interesting data points, which are then manually investigated, false positives would likely be
less problematic than false negatives.</p>
        <p>In a more thorough review of the errors found in the machine learning predictions
(specifically Pretagged+Shortening+DocFilter) and the rule-based predictions, we observe three main
cases that the machine learning system is able to handle better:</p>
        <p>First, in our dataset, people or organizations represented by someone else are not annotated
as taking part in the event (their connection to the event is handled in the form of a relationship
between them and the person representing them). E.g. ”Es verkauft Hans Vöglin innamen
seines bruders kinder” (Hans Vöglin sells in the name of his brother's children...) only classifies
”Hans Vöglin” as seller, but not ”kinder”. Our rule-based system does not contain a rule to
ignore these mentions when looking for the roles ”seller” and ”buyer”. Writing rules for these
cases isn't trivial either, as the phrasing and spelling of the words indicating these occurrences vary.
The machine learning system was able to correctly ignore these mentions in the examples we
investigated manually. Second, as already anticipated in the methodology section, the rule-based
system struggles with the misidentification of buyer as seller and, conversely, seller as buyer.
We observe that the machine learning system reduces the number of errors of this kind by two
thirds. Finally, we observe a remarkable difference when it comes to slightly altered phrasing in
the documents. While the machine learning system still fails when confronted with completely
foreign structures (such as a property purchase being discussed as a past event in the middle
of a rent purchase), it can handle small alterations quite well.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Learning Curve Analysis</title>
        <p>Figure 2 illustrates the performance of the machine learning system (+Shortening+DocFilter)
compared to the rule-based system. We observe that using around 40% of the training material
(appr. 54 samples) will result in role annotations comparable to the rule-based system, while
using 50% will achieve significantly better results. As usual for machine learning systems, the
increase in performance lessens with increasing sample size.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Impact of Variants</title>
        <p>The Shortening augmentation improves the scores in all settings where it was applied. The
strongest difference could be observed when evaluating roles with predicted triggers (p-value
= 0.0131). During error analysis, we found that the removed tokens can also result in a loss of
relevant information. Specifically, in clauses where a husband is named in conjunction with his
wife, e.g. ”Es verkaufen Hans, seine Frau Anna...” (Hans and his wife Anna sell...), which would
get shortened to ”Es verkaufen Hans, Anna...”, the shortened model would sometimes
misidentify the wife as a buyer, while the un-shortened model
would classify these instances correctly. We thus see the shortening strategy as a success, with
further research required for more fine-grained variants, e.g. only shortening mentions when
certain conditions are met.</p>
        <p>The document filter rule worked well and improved the results across the board. When
shortening is not applied at the same time, we observe a significant improvement (p-value =
0.0316). Otherwise we still observe a positive trend (p-value = 0.0518).</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.5. Practical usability</title>
        <p>Our final evaluation of this system in a practical use case scenario is mixed. On the one hand, the
system produces annotations that may well be used to create data in larger quantities where
general trends can be observed. One possible analysis could be to combine the
role annotations with the nested entity annotation to observe economic interactions between
occupational groups over time. On the other hand, the systems show a bias, larger or smaller
depending on the method applied, towards only finding events when the structure
of the document fits one of the three main templates. Any conclusions drawn from the
predicted event information therefore need to treat this bias with caution.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this case study, we've shown that even with relatively simple means, we can achieve
automated annotations which are usable in historical research. The scope of this case was
intentionally kept small to simplify evaluation and interpretation, but future research in our project will
explore how these systems perform across a broader range of event types. Most event types
that occur with sufficient frequency for machine learning exhibit structural
homogeneity similar to the documents in this study. We therefore assume the findings for property purchases
will also be applicable to other event types. We also aim to explore how transfer learning can
benefit event recognition for less frequent event types. We've shown that with our kind of data,
a machine-learning system can outperform a rule-based system by a significant margin even
when only little training data is available. Writing rules may still be quicker than annotating
documents, but considering both systems rely on pretagged texts, the amount of necessary
work can probably be reduced significantly if events and entities are annotated at the same time.
When working with any data annotated by these methods, knowledge of the bias that is
inherent to them is crucial. For example, the samples which do not fit the main templates might
come from a very specific source, which would lead the automated system to miss most
documents from that source and distort whatever conclusions we're trying
to draw from the quantitative results. But this study is only a first foray into event extraction
in historical texts and only looked at two quick-to-implement and easily accessible methods.
In future research the possible application of LLMs to this task should be investigated, as LLMs
have been shown to perform well in low-resource scenarios [<xref ref-type="bibr" rid="ref9">7</xref>], but their applicability to historical
German must be evaluated first.</p>
      <p>[2] A. Akbik, D. Blythe, and R. Vollgraf. “Contextual String Embeddings for Sequence
Labeling”. In: Proceedings of the 27th International Conference on Computational Linguistics.
Santa Fe, New Mexico, USA, 2018, pp. 1638–1649.</p>
    </sec>
    <sec id="sec-6">
      <title>A. Event Annotation Example</title>
      <p>&lt;event:sale&gt; Gend ze &lt;trigger&gt; kaufen &lt;/trigger&gt; &lt;seller&gt; Heinrich Trech von Lauezhut
der Kremer &lt;/seller&gt; u &lt;seller&gt; Margareth Lang Walcherin sin ewirtin &lt;/seller&gt; , &lt;buyer&gt;
Blesin Winsperg dem schnider&lt;/buyer&gt; u . &lt;buyer&gt; Margarethen siner ewirtin &lt;/buyer&gt; ,
&lt;property&gt; daz Hus u . Hofstatt genant zer Thannen , so gelegen als man von dem Vischmergt
heruf zem Sunfegen gat [...] , ist erb von dem gotshus Lienh denen jährl darab gand 3 lb 21 lot
pfefer ze wysung &lt;/property&gt; um &lt;price&gt; 150 fl . &lt;/price&gt; &lt;/event:sale&gt;</p>
      <p>Approximate English translation: Give to buy Heinrich Trech of Lauezhut the trader and Margareth
Lang Walcherin his wife, Blesin Winsperg the tailor and Margareth his wife, the property called
zer Thannen, which lies as you go up from the fish market to zum Sunfegen [...], is owned by the church
St. Lienhart which is paid 3 lb 21 lot of pepper yearly, for 150 fl.</p>
    </sec>
    <sec id="sec-7">
      <title>B. Ground Truth Example With Pretagged Text (BIO-Format)</title>
      <p>[Only the role column of this two-column example survived extraction. Role sequence:
O, O, B-Trigger, B-Seller, O, O, O, B-Property, O, O, O, O, O, O, B-Price, O, O, O]</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Akbik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Bergmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Blythe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Rasul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schweter</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Vollgraf</surname>
          </string-name>
          . “
          <article-title>FLAIR: An Easy-to-Use Framework for State-of-the-Art NLP</article-title>
          ”.
          <source>In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations).</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>54</fpage>
          -
          <lpage>59</lpage>
          . doi: 10.18653/v1/N19-4010.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ehrmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hamdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Pontes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Romanello</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Doucet</surname>
          </string-name>
          . “
          <article-title>Named Entity Recognition and Classification in Historical Documents: A Survey”</article-title>
          .
          <source>In: ACM Comput. Surv. 56.2</source>
          (
          <year>2023</year>
          ). doi: 10.1145/3604931.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Hodel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. Prada</given-names>
            <surname>Ziegler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Schneider</surname>
          </string-name>
          .
          <article-title>Pre-Modern Data: Applying Language Modeling and Named Entity Recognition on Criminal Records in the City of Bern. Presented at the Digital Humanities 2023</article-title>
          .
          <article-title>Collaboration as Opportunity (DH2023), Graz</article-title>
          , Austria.
          <year>2023</year>
          . doi: 10.5281/zenodo.8107616.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Beheshti</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <article-title>“A Survey on Deep Learning Event Extraction: Approaches and Applications”</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <source>In: IEEE Transactions on Neural Networks and Learning Systems 35.5</source>
          (
          <year>2024</year>
          ), pp.
          <fpage>6301</fpage>
          -
          <lpage>6321</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          doi: 10.1109/tnnls.2022.3213168.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>I. Prada</given-names>
            <surname>Ziegler</surname>
          </string-name>
          .
          <article-title>What's in an entity? Exploring Nested Named Entity Recognition in the Historical Land Register of Basel (1400-1700)</article-title>
          .
          <source>Presented at the Digital Humanities Benelux 2024, Leuven, Belgium.</source>
          <year>2024</year>
          . doi: 10.5281/zenodo.11500543.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ouyang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Wang.</surname>
          </string-name>
          <article-title>GPT-NER: Named Entity Recognition via Large Language Models</article-title>
          .
          <source>arXiv preprint arXiv:2304.10428 [cs.CL]</source>
          .
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>