<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>EVALITA</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Simple Ideas at CLinkaRT: LeaNER and MeaNER Relation Extraction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marius Micluța-Câmpeanu</string-name>
          <email>marius.micluta-campeanu@unibuc.ro</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Liviu P. Dinu</string-name>
          <email>liviu.p.dinu@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Mathematics and Computer Science, University of Bucharest</institution>
          ,
          <country country="RO">Romania</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Human Language Technologies Research Center, University of Bucharest</institution>
          ,
          <country country="RO">Romania</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Processing and Speech Tools for Italian</institution>
          ,
          <addr-line>Sep 7 - 8, Parma, IT</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>two consecutive Named Entity Recognition (NER) mod-</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>8</volume>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>In this paper, we present our approach for performing relation extraction on clinical texts in the context of the CLinkaRT task at EVALITA 2023. Our system ranked first in this task with an F1-score of 62.99, outperforming most other submissions by a significant margin: an increase of 6.5% over the second best score of 59.16, while also improving over the mBERT baseline of 62.83. We pursue a simple yet unexplored method to determine sentence-level relations in text by relying on Named Entity Recognition models to identify the components of a relation. We apply this method to link laboratory results to their corresponding events in medical reports.</p>
      </abstract>
      <kwd-group>
        <kwd>EVALITA</kwd>
        <kwd>CLinkaRT</kwd>
        <kwd>named entity recognition</kwd>
        <kwd>relation extraction</kwd>
        <kwd>transformers</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <title>The availability of vast quantities of textual data in the</title>
        <p>
          biomedical domain from digital repositories like PubMed
Central has led to the development of highly specialized
resources and language models [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Nonetheless, most of these efforts have been focused on English, while other less-resourced languages were largely neglected due to the lack of available datasets.
        </p>
        <p>
          The typical approach for downstream tasks in these
languages is to resort to multilingual models, such as
mBERT [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The rising need for pretrained models in
languages other than English for biomedical applications
materialized in the past few years with the advent of
BioBIT/MedBIT for Italian [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and similar models for
other lower-resource languages: Spanish [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], Turkish [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]
and French [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <p>
          In the context of creating better systems for Italian, the CLinkaRT shared task [<xref ref-type="bibr" rid="ref7">7</xref>] at EVALITA 2023 [<xref ref-type="bibr" rid="ref8">8</xref>] challenges participants to detect laboratory measurements and tests in clinical records in order to associate them with their corresponding results. The relevance of developing and improving relation extraction is highlighted in the literature, since it provides the core elements for building advanced biomedical text mining systems. Examples include discovering interactions between drugs, adverse effects, genes, chemicals and diseases; predicting inappropriate emergency room visits; generating educational documents; and building interaction networks.
        </p>
        <p>
          Having developed a similar system for our submission in the TESTLINK twin task [<xref ref-type="bibr" rid="ref12">12</xref>], we provide a shortened description of the implementation here and focus on the experiments and findings specific to CLinkaRT.
        </p>
      </sec>
      <sec id="sec-impl">
        <title>2.2. Implementation details</title>
        <p>
          Our approach relies on two consecutive Named Entity Recognition (NER) models. The first NER model is trained to predict all target entities in a sentence. For instance, the phrase “La creatinina oscillava tra 1,5–2 mg/dL con proteinuria sempre &lt; 1 g die” contains two relations: the target “creatinina” with “1,5–2 mg/dL” as its source, and “proteinuria” with “&lt; 1 g die” as its source. We begin by locating targets first because the annotations mark only the syntactic head of a target, e.g. esami “tests” is an appropriate target for both esami colturali “culture tests” and esami ematici “blood tests”.
        </p>
        <p>
          We encode the annotations for both NER models with standard IOB2 tags (inside, outside, beginning) for either sources (RML entities) or targets (EVENT entities). A regular relation extraction pipeline would first employ a NER model to determine sources and targets at the same time and then apply a relation classifier to all possible source-target pairs.
        </p>
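        <p>To make the target-side encoding concrete, the following sketch shows the IOB2 tags the first model would see for the example sentence (an illustration only: whitespace tokenization is a simplification, and EVENT/RML are the entity type names used above):</p>
        <preformat>
# IOB2 tags for the first (target) NER model: only the syntactic head
# of each target (EVENT) is annotated; everything else is outside.
tokens = ["La", "creatinina", "oscillava", "tra", "1,5-2", "mg/dL",
          "con", "proteinuria", "sempre", "&lt;", "1", "g", "die"]
tags   = ["O",  "B-EVENT",    "O",         "O",   "O",     "O",
          "O",  "B-EVENT",    "O",         "O",   "O",     "O",  "O"]
        </preformat>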
        <p>
          After determining all targets in a sentence, we transform the training examples to incorporate target locations directly in the text by adding a special marker token [T] before each target token, which should help the second NER model find relevant source entities. This is a viable strategy to denote one-to-one, one-to-many and many-to-one relations between sources and targets, thus effectively eliminating the need for a relation classifier model. The target annotation indicates just the syntactic head, so we do not add an end marker, because it might hinder the second NER model's ability to properly learn adequate target representations.
        </p>
        <p>
          With our approach, the first NER model is tasked to predict just target entities, while the second NER model is trained solely with labels for source entities. The consequence is that our models have a lower number of possible labels, determined by fewer IOB2 tags, therefore improving prediction performance. Each model has 3 tags: beginning, inside and outside. While the NER model used to predict targets only denotes the target head, we still need “inside” tags due to the sub-word splitting required by transformer models. Contrast this with a traditional pipeline that would have a NER model with 5 tags (2 beginning tags, 2 inside tags, 1 outside tag) followed by a relation classifier model.
        </p>
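        <p>The resulting label inventories, contrasted with those of a classic joint pipeline, can be summarized as follows (a sketch; the concrete label strings are an assumption):</p>
        <preformat>
TGT_LABELS = ["O", "B-EVENT", "I-EVENT"]   # first model: targets only
SRC_LABELS = ["O", "B-RML", "I-RML"]       # second model: sources only
# A traditional joint NER needs all five tags at once, followed by a
# separate relation classifier:
JOINT_LABELS = ["O", "B-EVENT", "I-EVENT", "B-RML", "I-RML"]
        </preformat>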
        <p>
          All relation types (one-to-one, one-to-many, many-to-one) are handled in a uniform manner. For every target in a sentence, we generate one sample with a single target marker [T]. In this regard, there is no difference between a source with multiple targets and several one-to-one relations. For one target with many sources, only a single example is created. This way, we augment our training data for the second NER model, as the sketch below illustrates.
        </p>
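        <p>A minimal sketch of this sample-generation step is given below (function and label names are ours, for illustration; token spans are end-exclusive). The numbered examples that follow correspond to its output:</p>
        <preformat>
def make_src_samples(tokens, target_heads, sources_by_target):
    """For each target, emit one copy of the sentence with a [T] marker
    inserted before the target head, plus IOB2 labels for its sources."""
    samples = []
    for head, sources in zip(target_heads, sources_by_target):
        marked = tokens[:head] + ["[T]"] + tokens[head:]
        labels = ["O"] * len(marked)
        for start, end in sources:
            for i in range(start, end):
                j = i + 1 if i >= head else i  # shift past the marker
                labels[j] = "B-RML" if i == start else "I-RML"
        samples.append((marked, labels))
    return samples

# "creatinina" (head 1) has source "1,5-2 mg/dL" at span (4, 6);
# "proteinuria" (head 7) has source "&lt; 1 g die" at span (9, 13).
sent = "La creatinina oscillava tra 1,5-2 mg/dL con proteinuria sempre &lt; 1 g die"
samples = make_src_samples(sent.split(), [1, 7], [[(4, 6)], [(9, 13)]])
        </preformat>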
        <p>
          As an example with two one-to-one relations, the sentence “La creatinina oscillava tra 1,5–2 mg/dL con proteinuria sempre &lt; 1 g die” has two targets, “creatinina” and “proteinuria”. The samples for our second NER model will be:
        </p>
        <p>(1) “La [T] creatinina oscillava tra 1,5–2 mg/dL con proteinuria sempre &lt; 1 g die”, with only “1,5–2 mg/dL” labeled as source;</p>
        <p>(2) “La creatinina oscillava tra 1,5–2 mg/dL con [T] proteinuria sempre &lt; 1 g die”, with only “&lt; 1 g die” labeled as source.</p>
        <p>
          In the following example, there is one source linked to three targets: “Gli esami colturali (germi comuni, BK) risultavano negativi.” The source is “negativi”, while the targets are “esami”, “germi” and “BK”. Three sentences will be added, all with a single source to be predicted (“negativi”):
        </p>
        <p>(3) “Gli [T] esami colturali (germi comuni, BK) risultavano negativi.”</p>
        <p>(4) “Gli esami colturali ([T] germi comuni, BK) risultavano negativi.”</p>
        <p>(5) “Gli esami colturali (germi comuni, [T] BK) risultavano negativi.”</p>
      </sec>
      <sec id="sec-aug">
        <title>2.3. Data augmentation</title>
        <p>
          The CLinkaRT training dataset consists of 83 Italian documents with 658 annotated relations. Due to the limited number of examples, we decide to augment the initial dataset with contextual word embeddings using the nlpaug library [<xref ref-type="bibr" rid="ref13">13</xref>]. For this process, we replace random words with other similar words in the embedding space, except for labeled tokens, since there is a risk of injecting noisy labels.
        </p>
        <p>
          We preserve the annotated entities and the target marker [T] in the augmented examples, ignoring sentences with 9 words or fewer. Augmented samples whose word count differs from the original are discarded, because the original labels would be misplaced.
        </p>
        <p>
          Given that one of our main concerns is finding laboratory tests and measurements, many of these entities are numeric values. To further our data augmentation, we introduce tiny changes of ±2 for integer values (age, year or quantities with a higher tolerance) and ±0.1 for real values (tests or percentages). In most cases, this process should not significantly alter the labeling.
        </p>
        <p>
          The training set has a much greater number of negative samples (examples without relations) than positive samples. We augment each example one or more times, with the number of augmented copies of positive instances denoted by in_multiplier and of negative instances by out_multiplier. We use the multipliers shown in Table 1, where the NER-tgt model predicts targets and the NER-src model predicts sources. The second model requires fewer auxiliary examples because the preprocessing step already created additional samples for sentences with more target entities.
        </p>
        <p>Table 1: Augmentation multipliers (in_multiplier, out_multiplier) for the NER-tgt and NER-src models.</p>
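        <p>A minimal sketch of the numeric perturbation described above (the regular expressions and the rounding are illustrative, not the exact implementation):</p>
        <preformat>
import random
import re

INT_RE = re.compile(r"^\d+$")
REAL_RE = re.compile(r"^\d+[.,]\d+$")

def perturb_number(token):
    """Shift integers by up to +/-2 and decimal numbers by up to +/-0.1;
    every other token is left untouched."""
    if INT_RE.match(token):
        return str(max(0, int(token) + random.randint(-2, 2)))
    if REAL_RE.match(token):
        sep = "," if "," in token else "."
        whole, frac = token.split(sep)
        value = float(f"{whole}.{frac}") + random.uniform(-0.1, 0.1)
        return f"{max(value, 0.0):.{len(frac)}f}".replace(".", sep)
    return token

# perturb_number("75")   -> e.g. "77"
# perturb_number("12,8") -> e.g. "12,7"
        </preformat>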
        <p>
          The bottleneck of this augmentation process is the library call that executes the transformation. Considering that the operation runs on GPU, it is natural to attempt to speed up this step by augmenting several examples in parallel. While the nlpaug library has an API that allows augmenting multiple sentences at once, and at first it appeared to work on a few samples in the train set, a significant number of augmented examples constructed by nlpaug turned out to be empty sentences, due to limitations or issues of this library. The batch implementation also required some effort because of the need to apply different multipliers. Since this attempted optimization did not succeed, we resumed augmenting examples one by one.
        </p>
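        <p>For reference, the per-example call looks roughly like the sketch below (an illustrative configuration, not the exact settings; the stopwords list stands in for the mechanism that shields labeled tokens and the [T] marker from substitution):</p>
        <preformat>
import nlpaug.augmenter.word as naw

# Contextual word-embedding substitution, one example at a time.
aug = naw.ContextualWordEmbsAug(
    model_path="IVN-RIN/medBIT-r3-plus",
    action="substitute",
    stopwords=["[T]", "creatinina", "1,5-2", "mg/dL"],  # protect labels
)

sentence = "La [T] creatinina oscillava tra 1,5-2 mg/dL"
augmented = aug.augment(sentence)  # a list in recent nlpaug versions
        </preformat>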
      </sec>
      <sec id="sec-train">
        <title>2.4. Model training and inference</title>
        <p>
          We implement our NER models as standard token classifiers with the help of the HuggingFace Transformers library [<xref ref-type="bibr" rid="ref14">14</xref>]. We perform fine-tuning on a model pretrained on Italian medical textbooks, web-crawled data and translated English PubMed abstracts, available as IVN-RIN/medBIT-r3-plus on the HuggingFace Hub [<xref ref-type="bibr" rid="ref3">3</xref>].
        </p>
        <p>
          All of our training experiments are carried out while mostly preserving the default parameters: the AdamW optimizer with a 5e-5 learning rate, linear decay and no warmup steps, 1e-2 weight decay, and 8 samples per batch, trained for 4 epochs, with 10% of the examples held out for validation. Both models are trained independently using gold labels, with the second NER model (NER-src) receiving the target marker tokens [T] from these gold annotations.
        </p>
        <p>
          In inference mode, the models are asked to output source and target offsets with respect to the original raw text. For each sentence converted into an example, we store the offset of its first token. Since the HuggingFace Datasets library employs a separate tokenization, we align the transformed concatenated sentences with the initial full texts by using spaCy [<xref ref-type="bibr" rid="ref15">15</xref>].
        </p>
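        <p>A simplified sketch of this offset bookkeeping follows (the Italian pipeline name is illustrative, and the actual alignment code differs in the details):</p>
        <preformat>
import spacy

nlp = spacy.load("it_core_news_sm")  # illustrative Italian pipeline

def sentences_with_offsets(text):
    """Split a report into sentences, keeping each sentence's character
    offset so spans predicted per sentence map back to the raw text."""
    doc = nlp(text)
    return [(sent.text, sent.start_char) for sent in doc.sents]

# A source predicted at characters (start, end) inside a sentence whose
# offset is k corresponds to (start + k, end + k) in the original text.
        </preformat>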
      </sec>
      <sec id="sec-results">
        <title>3. Results</title>
        <p>
          We conduct our experiments by creating a test set from the training set with a 90/10 split in order to simulate the final test set, switching to 10-fold cross-validation for selecting appropriate values for some of the parameters regarding training and augmentation. Although the models are trained at the sentence level, this split is made by document id, so we do not overestimate the model performance on unseen examples.
        </p>
        <p>
          The main results for relation extraction in the CLinkaRT task are displayed in Table 2. Our team obtains first place across all teams, with an F1-score of 62.99, an improvement of 6.5% over the second best competing team at 59.16 and a slight increase over the mBERT baseline of 62.83. We also achieve the highest recall among participants, 60.62, with the second best participant at 50.65, while the mBERT baseline has a recall of 64.37. We improve on the baseline precision of 61.37 by 6.8%, reaching a score of 65.55.
        </p>
        <p>
          We report the performance of our system on the validation set averaged across 10 folds together with the official results. We used cross-validation to carry out parameter and model selection. Besides the 10% of examples reserved for testing, the models also set aside 10% of the remaining examples for validation and hyperparameter tuning. In spite of these efforts, we notice a possible tendency to overfit. One explanation for this phenomenon is the small size of the model validation set, with too few samples to properly adjust the parameters during training. Another related explanation is the high variation between some of the folds: half of the folds obtain F1-scores over 82%, while the other folds consistently score lower, between 73% and 77%. Even so, it might simply be the case that the test set is intentionally constructed with novel situations to determine the performance on unseen data more accurately, which would justify the gap between test and validation.
        </p>
        <p>
          Outside the evaluation window, we repeated the inference process a second time on the test set, keeping the same parameters, and achieved a 64.09 F1-score, showing that our approach can outrun the other systems by a greater margin than in the official results. Still, this variation is caused by the nondeterministic nature of transformer networks. We plan to analyze the extent of this variation and to limit the randomness of our system.
        </p>
      </sec>
    <sec id="sec-2">
      <title>4. Discussion</title>
      <sec id="sec-2-1">
        <title>In this section, we present some observations regarding</title>
        <p>the design choices of our implementation and conduct
an error analysis.</p>
        <p>The data augmentation process has three main
parameters: the minimum number of words that should be
replaced in a sentence, min_aug, and the two multipli- (6) The true source is “pari 0 inferiori a 1.5 mg/dl”
ers for positive and negative examples defined earlier as and the predicted source is “inferiori a 1.5 mg/dl”.
in_multiplier and out_multiplier. We pick values Other similar examples: true source is “fino a 12.8
between 3 and 6 for min_aug, based on the number of mg/dL”, predicted source is “12.8 mg/dL”; true
failed replacements, fixing the value at 4 words. The source is “punte di circa 1200 pg/mL”, predicted
reasoning behind this decision is that the augmentation source is “circa 1200 pg/mL”.
library is sometimes unable to adequately generate valid
examples due to misplaced or missing words, so the gold (7) The true target is “antitrombina”, while the
prelabels cannot be applied, in turn leading to fewer exam- dicted target is “anti”
ples in the train set. The first situation appears due to modifying
compara</p>
        <p>The multipliers are selected by cross-validation, stop- tives not being present or being scarcely existent in the
ping early in case of unsatisfactory results on the first training data, which one could solve with additional
exfolds. For out_multiplier, we vary this parameter amples or through careful augmentation. The second
between 0 and 2 for both NER models, while for in_- issue seems to be a defect on our side that can be handled
multiplier we use values in the range 1–5 for NER-tgt in post-processing by inspecting the initial tokenization.
and 1–4 for NER-src. Our experiments confirm that aug- Another common mistake is the prediction of one
rementation is also needed for negative samples. This step lation instead of two (or vice versa) in the case of
interhas a significant impact in our system, boosting the score vals, which we explain by ambiguities in the training set.
on the validation set with over 20 percentage points in For example, our system outputs “1.9 – 2.5 mg/dl” linked
F1-score. As it would be expected, adding too many ex- with “creatininemia”, but there are two expected relations:
amples by using larger multipliers eventually induces “1.9” linked with “creatininemia” and “2.5 mg/dl” linked
overfit. with “creatininemia”. Conversely, our system detects two</p>
        <p>The main drawback of augmentation is the slow data relations, “sostanzialmente” linked with “obiettività” and
generation. As we mentioned earlier, nlpaug runs se- “nei limiti di norma” linked with “obiettività”, while there
quentially, so we had to make educated guesses of what is only one true relation, “sostanzialmente nei limiti di
combinations of parameters to include in our experi- norma” linked with “obiettività”.
ments. Lastly, a challenging facet of this task is the presence</p>
        <p>
          For most of our experiments, we rely on the model of reference values for some tests, which are picked up
called MedBIT-R3-plus [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] accessible on HuggingFace by our model, although they are not found as gold
laHub. In order to determine if this is the right choice, we bels because they do not represent test results. Future
briefly examine the efectiveness of other transformer work in this direction should find means to distinguish
models. We consider three alternative options: Italian between reference values and actual measurements and
BERT [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], the multilingual version of DistilBERT [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] test values.
and the BioBIT model trained only on medical textbooks
[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Undoubtedly, Italian BERT is less suitable for this
task, with a substantial drop in F1-score of 14 percentage
        </p>
      </sec>
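      <p>The post-processing fix suggested earlier for truncated predictions such as “anti” could look like the following sketch (a simple whitespace heuristic; inspecting the actual tokenization would be more careful):</p>
      <preformat>
def expand_to_word(text, start, end):
    """Widen a predicted character span to the enclosing whitespace-
    delimited word, e.g. recovering "antitrombina" from "anti"."""
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    while end &lt; len(text) and not text[end].isspace():
        end += 1
    return start, end

# expand_to_word("dosaggio antitrombina ridotto", 9, 13) -> (9, 21)
      </preformat>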
    </sec>
    <sec id="sec-3">
      <title>5. Conclusion and future work</title>
      <sec id="sec-3-1">
        <title>In this paper, we detailed our contribution in the</title>
        <p>
          CLinkaRT task [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] at EVALITA 2023 [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], demonstrating
that intuitive solutions yield competitive results. Our
proposed approach achieves the best F1-score among the participating systems in the task of linking laboratory tests and measurements with their results, with a 6.5% improvement over the second best contestant.
        </p>
        <p>We present a straightforward strategy to extract
sentence-level relations based on two plain NER
models, illustrating the learning capabilities of transformer
networks to solve challenging tasks with the help of
special tokens. We intend to further explore this direction
since NER models are well established and usually
require fewer resources than alternative relation extraction
(RE) models. The presented method is not limited to
the clinical domain and it can be easily applied in other
contexts, with the added benefit of shorter development
cycles. In certain domains and applications, the overhead
of a generic RE model may be unjustified if the relations
in question are simple enough.</p>
        <p>Data augmentation is a valuable, but underused
technique in natural language processing contexts. We look
forward to enhancing the augmentation procedure to
account for in-domain information. Another area we
believe to be worth pursuing is the handling of numeric
values and ranges, either by finding a way to inject fuzzy
intervals or by masking these values altogether, therefore
simplifying the initial problem.</p>
        <p>Our system implementation is available at https://gitlab.com/marius.micluta-campeanu/testlink-clinkart-2023 to encourage an open environment for future work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] I. Beltagy, K. Lo, A. Cohan, SciBERT: A pretrained language model for scientific text, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong, China, 2019, pp. 3615-3620. URL: https://aclanthology.org/D19-1371. doi:10.18653/v1/D19-1371.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 4171-4186. URL: https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] T. M. Buonocore, C. Crema, A. Redolfi, R. Bellazzi, E. Parimbelli, Localising In-Domain Adaptation of Transformer-Based Biomedical Language Models, 2022. arXiv:2212.10422.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] C. P. Carrino, J. Armengol-Estapé, A. Gutiérrez-Fandiño, J. Llop-Palao, M. Pàmies, A. Gonzalez-Agirre, M. Villegas, Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario, 2021. arXiv:2109.03570.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] H. Türkmen, O. Dikenelli, C. Eraslan, M. C. Çalli, S. S. Ozbek, Developing Pretrained Language Models for Turkish Biomedical Domain, in: 2022 IEEE 10th International Conference on Healthcare Informatics (ICHI), IEEE, 2022, pp. 597-598.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] Y. Labrak, A. Bazoge, R. Dufour, M. Rouvier, E. Morin, B. Daille, P.-A. Gourraud, DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains, in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Toronto, Canada, 2023, pp. 16207-16221. URL: https://aclanthology.org/2023.acl-long.896.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] B. Altuna, G. Karunakaran, A. Lavelli, B. Magnini, M. Speranza, R. Zanoli, CLinkaRT at EVALITA 2023: Overview of the Task on Linking a Lab Result to its Test Event in the Clinical Domain, in: Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023), CEUR.org, Parma, Italy, 2023.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] M. Lai, S. Menini, M. Polignano, V. Russo, R. Sprugnoli, G. Venturi, EVALITA 2023: Overview of the 8th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian, in: Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023), CEUR.org, Parma, Italy, 2023.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] Y. Wang, L. Wang, M. Rastegar-Mojarad, S. Moon, F. Shen, N. Afzal, S. Liu, Y. Zeng, S. Mehrabi, S. Sohn, H. Liu, Clinical information extraction applications: A literature review, Journal of Biomedical Informatics 77 (2018) 34-49. URL: https://www.sciencedirect.com/science/article/pii/S1532046417302563. doi:10.1016/j.jbi.2017.11.011.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] N. Perera, M. Dehmer, F. Emmert-Streib, Named Entity Recognition and Relation Detection for Biomedical Information Extraction, Frontiers in Cell and Developmental Biology 8 (2020). URL: https://www.frontiersin.org/articles/10.3389/fcell.2020.00673. doi:10.3389/fcell.2020.00673.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] B. Magnini, B. Altuna, A. Lavelli, M. Speranza, R. Zanoli, The E3C Project: Collection and Annotation of a Multilingual Corpus of Clinical Cases, in: J. Monti, F. Dell'Orletta, F. Tamburini (Eds.), Proceedings of the Seventh Italian Conference on Computational Linguistics, volume 2769 of CLiC-it, CEUR-WS, Milan, Italy, 2020, pp. 422-431.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] B. Altuna, R. Agerri, L. Salas-Espejo, J. J. Saiz, A. Lavelli, B. Magnini, M. Speranza, R. Zanoli, G. Karunakaran, Overview of TESTLINK at IberLEF 2023: Linking Results to Clinical Laboratory Tests and Measurements, Procesamiento del Lenguaje Natural 71 (2023).</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] E. Ma, NLP Augmentation, https://github.com/makcedward/nlpaug, 2019.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer, P. von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. L. Scao, S. Gugger, M. Drame, Q. Lhoest, A. M. Rush, Transformers: State-of-the-Art Natural Language Processing, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Association for Computational Linguistics, Online, 2020, pp. 38-45. URL: https://www.aclweb.org/anthology/2020.emnlp-demos.6.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] I. Montani, M. Honnibal, S. V. Landeghem, A. Boyd, H. Peters, P. O. McCann, jim geovedi, J. O'Regan, M. Samsonov, G. Orosz, D. de Kok, M. Blättermann, D. Altinok, S. L. Kristiansen, M. Kannan, R. Mitsch, R. Bournhonesque, Edward, L. Miranda, P. Baumgartner, R. Hudson, E. Bot, Roman, L. Fiedler, R. Daniels, W. Phatthiyaphaibun, G. Howard, Y. Tamura, spaCy: Industrial-strength Natural Language Processing in Python, 2023. URL: https://doi.org/10.5281/zenodo.7715077. doi:10.5281/zenodo.7715077.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] S. Schweter, Italian BERT and ELECTRA models, 2020. URL: https://doi.org/10.5281/zenodo.4263142. doi:10.5281/zenodo.4263142.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] V. Sanh, L. Debut, J. Chaumond, T. Wolf, DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, ArXiv abs/1910.01108 (2019).</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>