<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Multilingual Clinical NER for Diseases and Medications Recognition in Cardiology Texts using BERT Embeddings</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Manuela Daniela Danu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>George Marica</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Constantin Suciu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lucian Mihai Itu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oladimeji Farri</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Advanta</institution>
          ,
          <addr-line>Siemens SRL, 15 Noiembrie Bvd, 500097 Brasov</addr-line>
          ,
          <country country="RO">Romania</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Automation and Information Technology, Transilvania University of Brasov</institution>
          ,
          <addr-line>5 Mihai Viteazul Street, 500174 Brasov</addr-line>
          ,
          <country country="RO">Romania</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Digital Technology and Innovation</institution>
          ,
          <addr-line>Siemens Healthineers, 755 College Rd E, 08540 Princeton, NJ</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The rapidly increasing volume of electronic health record (EHR) data underscores a pressing need to unlock biomedical knowledge from unstructured clinical texts to support advancements in data-driven clinical systems, including patient diagnosis, disease progression monitoring, treatment effects assessment, and prediction of future clinical events. While contextualized language models have demonstrated impressive performance improvements for named entity recognition (NER) systems in English corpora, there remains a scarcity of research focused on clinical texts in low-resource languages. To bridge this gap, our study aims to develop multiple deep contextual embedding models to enhance clinical NER in the cardiology domain, as part of the BioASQ MultiCardioNER shared task. We explore the effectiveness of different monolingual and multilingual BERT-based models, trained on general domain text, for extracting disease and medication mentions from clinical case reports written in English, Spanish, and Italian. We achieved an F1-score of 77.88% on Spanish Diseases Recognition (SDR), 92.09% on Spanish Medications Recognition (SMR), 91.74% on English Medications Recognition (EMR), and 88.9% on Italian Medications Recognition (IMR). These results outperform the mean and median F1 scores in the test leaderboard across all subtasks, with the mean/median values being: 69.61%/75.66% for SDR, 81.22%/90.18% for SMR, 89.2%/88.96% for EMR, and 82.8%/87.76% for IMR.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        With the increasing amount of available electronic health record (EHR) data, clinical natural language
processing (NLP) tasks have become significantly important for extracting valuable information from
unstructured clinical texts [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Named Entity Recognition (NER) is a key NLP task used to identify
meaningful entities within these texts, such as anatomical structures, diseases and disorders, signs
and symptoms, procedures, and medications [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Consequently, this facilitates various data analysis
applications, ranging from predicting future clinical events [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] to summarization [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and relation
extraction between entities (e.g., drug-to-drug interactions [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], symptom-disease relationship [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
patient-procedure association [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], etc.).
      </p>
      <p>
        Despite recent advances in deep learning methods for NER [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ], extracting structured information
from the vast amounts of unstructured and often noisy clinical documents in EHR systems remains
challenging due to the highly specialized medical language, which varies considerably across different
medical specialties, as well as due to the prevalence of misspellings, abbreviations, and use of synonyms
to express clinical concepts [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        While contextualized language models have recently improved the performance of NER systems for
English corpora [
        <xref ref-type="bibr" rid="ref10 ref11 ref8">8, 10, 11</xref>
        ], there is a notable lack of research focused on clinical texts in low-resource
languages. To address this gap, our study aims to develop multiple deep contextual embedding models
for English, Spanish, and Italian to enhance clinical NER in the cardiology domain, as part of the
MultiCardioNER shared task [
        <xref ref-type="bibr" rid="ref12 ref13 ref14">12, 13, 14</xref>
        ]. The MultiCardioNER task is part of the twelfth edition of
the large-scale biomedical semantic indexing and question answering challenge (BioASQ) [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ], a
long-standing initiative aiming to advance research by developing methods and tools that leverage
the vast amount of online information to meet the needs of biomedical researchers and practitioners.
This initiative seeks to provide efficient and rapid access to the continuously expanding resources and
knowledge in the biomedical field.
      </p>
      <p>
        MultiCardioNER [
        <xref ref-type="bibr" rid="ref12 ref13 ref14">12, 13, 14</xref>
        ] is a shared task that aims to automatically identify two key clinical
concepts in medical documents pertaining to cardiology, namely diseases and medications. This task
focuses on adapting clinical NER systems to work effectively across multiple languages - primarily
Spanish, English, and Italian - for two different subtasks: (1) diseases recognition in Spanish cardiology
texts, and (2) medications recognition in cardiology texts written in Spanish, English, and Italian. Both
subtasks involve reading and analyzing clinical texts to identify the clinical entities mentioned in the
text and using the BRAT format to mark the starting and ending positions of these entities.
      </p>
      <p>
        In this paper, we created four different monolingual models: (1) Spanish Diseases Recognition (SDR),
(2) Spanish Medications Recognition (SMR), (3) English Medications Recognition (EMR), and (4) Italian
Medications Recognition (IMR). Additionally, we developed two multilingual models: one specialized
for Spanish Diseases Recognition (Multi-SDR) and another for Medications Recognition across all three
targeted languages (Multi-MMR). We applied transfer learning techniques by fine-tuning BERT-based
[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] contextual embeddings, originally trained on general domain text in each of the three languages,
for the biomedical domain to extract diseases and medications from clinical reports.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        In clinical and biomedical NER, recent studies have explored various methodologies to enhance
performance. A key model in this domain is multilingual BERT (M-BERT) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], trained on 104 Wikipedia
languages, which excels in various tasks without explicit cross-lingual alignment [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], outperforming
models based on cross-lingual embeddings [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>
        [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] improved biomedical NER by incorporating syntactic information, enhancing recognition of
complex entity relationships. [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] focused on de-identifying Spanish medical texts via NER and
entity randomization, achieving high recall rates on radiology reports and MEDDOCAN [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] challenge
data. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] developed BioELECTRA, a biomedical text encoder using discriminators, which outperformed
several baselines on multiple biomedical NER benchmarks by leveraging ELECTRA’s efficiency and
accuracy in text encoding.
      </p>
      <p>
        [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] developed a scalable NER system for large biomedical datasets, emphasizing real-time processing
and high accuracy. [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] focused on pre-trained biomedical language models for clinical NLP in Spanish,
addressing the need for multilingual capabilities in biomedical NER. [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] optimized a bi-encoder for
NER using contrastive learning, introducing dynamic thresholding to improve accuracy, especially for
nested entities, with significant gains on datasets like ACE [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] and GENIA [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] used a novel
schema with distant supervision to enhance NER accuracy, showing that domain-specific schema can
supplement limited annotated data effectively.
      </p>
      <p>
        [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] used ChatGPT [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] for zero-shot clinical entity recognition with prompt engineering, showing
it outperforms GPT-3 [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] but trails behind fine-tuned BioClinicalBERT [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] models. [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] leveraged
transfer learning and asymmetric tri-training, combining labeled and pseudo-labeled data to boost NER
performance across biomedical datasets.
      </p>
      <p>
        To advance the development of medical NER systems, the BioASQ challenge proposed multiple
clinical NER tasks to be solved over time, such as automatic detection and normalization of disease
mentions from clinical texts (DisTEMIST) [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] or medical procedure detection and entity linking
(MedProcNER) [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ]. Most participating teams employed Transformer-based and large language models
in their approaches.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Datasets</title>
        <p>
          With a focus on adapting general medical NER systems for diseases and medications across multiple
languages, the MultiCardioNER [
          <xref ref-type="bibr" rid="ref12 ref33">12, 33</xref>
          ] task leverages several datasets. Specifically, it utilizes a training
collection of 1000 general clinical case reports in Spanish, covering various medical specialties such as
oncology, urology, ophthalmology, dentistry, pediatrics, primary care, allergology, radiology, psychiatry,
and more [
          <xref ref-type="bibr" rid="ref31 ref33">33, 31</xref>
          ]. These reports were annotated with diseases and medications, resulting in two
distinct corpora, namely DisTEMIST [
          <xref ref-type="bibr" rid="ref31 ref33">33, 31, 34</xref>
          ] and DrugTEMIST [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ]. The DrugTEMIST [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] corpus
was also released in English and Italian. Since the original 1000 clinical case reports belong to the
Spanish Clinical Case Corpus (SPACCC) [35], the multilingual DrugTEMIST [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] dataset was originally
created in Spanish and then transferred into English and Italian using machine translation and lexical
annotation projection. The result of this process was revised and validated by clinical experts who are
native speakers of each language.
        </p>
        <p>
          For the domain adaptation part of the task, MultiCardioNER [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] leverages a collection of 508
annotated cardiology clinical case reports (CardioCCC), divided into 258 for development and 250 for
testing. The annotation process followed the same guidelines as the DisTEMIST [36] and DrugTEMIST
[37] corpora, with the medication part also released in Spanish, English and Italian. In addition to the
test set, an auxiliary collection of multilingual clinical case reports, referred to as the background set, is
provided to facilitate the creation of a silver standard corpus and ensure the developed systems can
effectively scale up to larger content collections.
        </p>
        <p>All datasets were manually annotated by clinical experts using the BRAT annotation tool [38],
following annotation guidelines [36, 37] that were established after several cycles of quality control and
annotation consistency analysis.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Experiments</title>
        <p>
          In this work, we treated the automatic named entity recognition (NER) of diseases and medications
in clinical case reports as a multi-class token classification task. To accomplish this, we employed
pre-existing BERT models [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] for NER in the general domain for each of the three languages (Spanish,
English, and Italian), as well as a multilingual model, and further fine-tuned them for the biomedical
domain using the MultiCardioNER dataset [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ].
        </p>
        <p>We experimented with the following BERT-based models, specifically trained to perform NER:</p>
        <list list-type="bullet">
          <list-item>
            <p>bert-spanish-cased-finetuned-ner [39]: a Spanish BERT cased model based on BETO [40].
Originally fine-tuned on the Spanish dataset of the CoNLL-2002 Shared Task [41], BETO was
further fine-tuned on the Catalan and Basque subsets of the CoNLL-2007 dataset [42], resulting
in the bert-spanish-cased-finetuned-ner model, which focuses on recognizing persons (PER),
organizations (ORG), locations (LOC), and miscellaneous (MISC) entities within Spanish text
documents.</p>
          </list-item>
          <list-item>
            <p>bert-base-NER [43]: a BERT cased model fine-tuned on the English version of the standard
CoNLL-2003 dataset [44]. It was trained to recognize four types of entities, namely locations
(LOC), organizations (ORG), persons (PER), and miscellaneous (MISC).</p>
          </list-item>
          <list-item>
            <p>bert-italian-finetuned-ner [45]: an Italian BERT cased model fine-tuned on the WikiANN
dataset [46], which consists of Wikipedia articles annotated with LOC (location), PER (person),
and ORG (organisation) tags.</p>
          </list-item>
          <list-item>
            <p>bert-base-multilingual-cased-ner-hrl [47]: a named entity recognition model for 10
high-resourced languages (Arabic, German, English, Spanish, French, Italian, Latvian, Dutch,
Portuguese and Chinese) based on a fine-tuned multilingual cased BERT model. It has been trained
to recognize three types of entities: locations (LOC), organizations (ORG), and persons (PER).</p>
          </list-item>
        </list>
        <p>All these BERT-based models utilize the standard Beginning-Inside-Outside (BIO) format [48] for
tagging entities. This format allows NER to be approached as a multi-class token classification
task, where words are labelled B if they represent the beginning of an entity, I if they are inside an
entity, and O if they are outside any entity. This labeling method effectively distinguishes between the
beginning and continuation of an entity, thereby simplifying the task of identifying entity boundaries.</p>
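        <p>As a minimal illustration of the BIO scheme described above, the following Python sketch assigns B, I, and O tags to word-level tokens given character-offset entity spans. The function names, the regex-based tokenizer, and the assumption that entity spans align with token boundaries are illustrative choices, not necessarily the implementation used in this work.</p>
        <preformat preformat-type="code">
```python
import re

def word_tokens(text):
    """Return (token, start, end) triples; end is exclusive.

    Assumption: a simple regex split into words and single punctuation marks.
    """
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\w+|[^\w\s]", text)]

def bio_encode(text, entities, label):
    """Assign B-/I-/O tags to word-level tokens.

    entities: list of (start, end) character spans, end exclusive, assumed
    to be non-overlapping and aligned with token boundaries.
    """
    tagged = []
    for token, start, end in word_tokens(text):
        tag = "O"
        for ent_start, ent_end in entities:
            # A token belongs to an entity if it lies fully inside its span.
            if start >= ent_start and ent_end >= end:
                tag = ("B-" if start == ent_start else "I-") + label
                break
        tagged.append((token, tag))
    return tagged
```
        </preformat>
        <p>For a disease mention spanning two words, the first word would receive B-ENFERMEDAD, the second I-ENFERMEDAD, and every other token O.</p>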
        <p>Before performing the clinical domain adaptation of the general domain BERT-based models, the
medical reports undergo a pre-processing step which involves splitting them into sentences to ensure a
sequence length of less than or equal to 256. These sentences are then further segmented into word-level
tokens while preserving their start and end offsets with respect to the original report. The word-level
tokens are encoded in BIO format and used to fine-tune BERT-based models on the MultiCardioNER
dataset. The output from the BERT models is then post-processed to comply with BRAT format. Figure 1
provides an overview of the prediction pipeline.</p>
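        <p>The post-processing step mentioned above can be sketched as follows: consecutive B/I tags are merged back into character spans and written as BRAT text-bound annotation lines. This is a hedged sketch; the function name, the handling of stray I tags, and the input layout (token offsets plus aligned tags) are assumptions made for illustration.</p>
        <preformat preformat-type="code">
```python
def bio_to_brat(offsets, tags, text):
    """Merge BIO-tagged tokens into BRAT standoff lines.

    offsets: list of (start, end) character offsets into text, end exclusive.
    tags: list of BIO tags aligned with offsets.
    """
    spans, current = [], None  # current = [label, start, end]
    for (start, end), tag in zip(offsets, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "I" and current is not None and current[0] == label:
            current[2] = end  # extend the open entity
            continue
        if current is not None:
            spans.append(current)
            current = None
        if prefix in ("B", "I"):
            # Assumption: a stray I- tag with no open entity starts a new one.
            current = [label, start, end]
    if current is not None:
        spans.append(current)
    # BRAT text-bound annotation: ID, TAB, "label start end", TAB, covered text.
    return [f"T{i}\t{label} {start} {end}\t{text[start:end]}"
            for i, (label, start, end) in enumerate(spans, 1)]
```
        </preformat>
        <p>Because the token offsets refer to the original report, the resulting spans can be written directly to a BRAT .ann file without re-aligning the sentence-level predictions.</p>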
        <p>Details regarding the label lists used for each subtask, as well as the hyper-parameters configuration
employed in the experiments, are provided in sections 3.2.1 and 3.2.2, respectively.</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Subtask 1: Diseases Recognition in Spanish Cardiology Texts</title>
          <p>For the subtask aiming to address the recognition of diseases in Spanish cardiology texts, we leveraged
the pre-trained bert-spanish-cased-finetuned-ner and bert-base-multilingual-cased-ner-hrl models and
further fine-tuned them on the MultiCardioNER dataset. Specifically, we employed the DisTEMIST
corpora as the training set for the general clinical domain adaptation part of the task and used the
disease-annotated version of the Spanish CardioCCC clinical cases as the development set to identify the
best-performing models in the cardiology domain, resulting in the Clinical-SDR and MultiClinical-SDR
models. We additionally experimented with fine-tuning these models on the CardioCCC development
set, leading to the creation of the cardiology-specialized Cardio-SDR and MultiCardio-SDR models.</p>
          <p>Following the standard BIO format [48], we defined our label list as follows: B-ENFERMEDAD,
I-ENFERMEDAD, O, [CLS], and [SEP]. B-ENFERMEDAD and I-ENFERMEDAD denote the beginning and
continuation of disease mentions within text sequences, whereas the label O corresponds to word-level
tokens outside any recognized entity. Additionally, the [CLS] token indicates the commencement of
a sentence, while the [SEP] token marks its termination. Notably, the [CLS] token also serves as a
placeholder for the [PAD] token within the text sequences.</p>
          <p>The Spanish Diseases Recognition (SDR) models were fine-tuned on an NVIDIA GeForce RTX 3090
(24GB) GPU for 10 epochs. The utilized hyper-parameters configuration includes a maximum sequence
length of 256, a batch size of 8, and a learning rate of 9e-6. Predictions were generated for both the
test and background sets, but the evaluation exclusively considered the predictions achieved on the
test set. In addition to the test set results, we also reported the development set results to identify any
discrepancies between them and, hence, detect potential overfitting or any issues related to the data
split distributions.</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Subtask 2: Multilingual (Spanish, English and Italian) Medications Recognition in Cardiology Texts</title>
          <p>For the second subtask, which focuses on the recognition of medications in cardiology texts written
in Spanish, English, and Italian, we employed three monolingual pre-trained models
(bert-spanish-cased-finetuned-ner, bert-base-NER, and bert-italian-finetuned-ner), each specialized for one of the three
languages, as well as a multilingual model (bert-base-multilingual-cased-ner-hrl), and subsequently
fine-tuned them on the MultiCardioNER dataset. Specifically, we leveraged the DrugTEMIST corpora in
each of the three languages as the training sets for the general clinical domain adaptation part of the
task and used the medication-annotated version of the CardioCCC clinical cases in Spanish, English,
and Italian as the development sets to identify the best-performing models in the cardiology domain,
resulting in the Clinical-SMR, Clinical-EMR, Clinical-IMR, and MultiClinical-MMR models. We again
conducted additional experiments by fine-tuning these models on the CardioCCC development sets,
thereby obtaining the cardiology-specialized Cardio-SMR, Cardio-EMR, Cardio-IMR, and
MultiCardio-MMR models. It is worth noting that the multilingual model was trained on an aggregated dataset
encompassing all three languages, but separately evaluated for each language to assess its performance
across different linguistic contexts.</p>
          <p>In accordance with the standard BIO format [48], we defined the label list for this subtask as
B-FARMACO, I-FARMACO, O, [CLS], and [SEP]. The tags B-FARMACO and I-FARMACO denote the
beginning and continuation of medication mentions within text sequences, while the label O marks the
word-level tokens not associated with any recognized entity. Additionally, as in Subtask 1, the [CLS]
token indicates the beginning of a sentence, while the [SEP] token marks its end. In this context, the
[CLS] token also serves as a placeholder for the [PAD] token within the text sequences.</p>
          <p>The Spanish Medications Recognition (SMR), English Medications Recognition (EMR), Italian
Medications Recognition (IMR), and Multilingual Medications Recognition (MMR) models were independently
fine-tuned on an NVIDIA GeForce RTX 3090 (24GB) GPU for 10 epochs. The utilized hyper-parameters
configuration is identical to that employed for Subtask 1 and consists of a maximum sequence length
of 256, a batch size of 8, and a learning rate of 9e-6. Predictions were generated for both the test and
background sets. However, the evaluation exclusively considered the predictions obtained on the test
set. In addition to the test set results, we also reported the development set results to identify any
mismatch between them, which could indicate overfitting or issues related to the data split distributions.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>In this work, we evaluated the developed systems using a flat evaluation approach [49] by comparing
the automatically generated results with those obtained by domain experts through manual annotation.
The primary focus was on identifying and classifying clinical mentions of diseases and medications
in cardiology reports. The performance metrics employed for flat evaluation include micro-averaged
precision, recall, and F1-score (MiF). These metrics were computed based on the exact matches of the
predicted entities and the annotated ground-truth. Table 1 summarises the evaluation results obtained
on the development and test sets using the officially released evaluation library for the MultiCardioNER
task. In the test set evaluation, we achieved the following F1-scores: 77.88% for Spanish Diseases
Recognition (SDR), 92.09% for Spanish Medications Recognition (SMR), 91.74% for English Medications
Recognition (EMR), and 88.9% for Italian Medications Recognition (IMR).</p>
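      <p>To make the exact-match, micro-averaged metrics described above concrete, the following Python sketch pools true positives, false positives, and false negatives over all documents before computing precision, recall, and F1-score. It is an illustrative re-implementation under stated assumptions (entities represented as (start, end, label) tuples per document), not the official MultiCardioNER evaluation library.</p>
      <preformat preformat-type="code">
```python
def micro_prf(gold, pred):
    """Micro-averaged exact-match precision, recall, and F1.

    gold, pred: dicts mapping a document id to a set of
    (start, end, label) tuples. A prediction counts as correct only if
    its boundaries and label match a gold entity exactly.
    """
    tp = fp = fn = 0
    for doc_id in set(gold) | set(pred):
        g = gold.get(doc_id, set())
        p = pred.get(doc_id, set())
        tp += len(g.intersection(p))
        fp += len(p - g)
        fn += len(g - p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```
      </preformat>
      <p>Under exact matching, a prediction that overlaps a gold entity but differs in either boundary counts as both a false positive and a false negative.</p>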
      <p>[Table 1. Columns: Fine-tuning (No/Yes); Precision, Recall, and F1-score on the development set; Precision, Recall, and F1-score on the test set. Rows: one No/Yes pair per evaluated model. Cell values not recovered.]</p>
      <p>These results surpass the mean and median F1 scores in the test leaderboard across all subtasks, with
the mean/median values being: 69.61%/75.66% for SDR, 81.22%/90.18% for SMR, 89.2%/88.96% for EMR,
and 82.8%/87.76% for IMR.</p>
      <p>The experiments marked with an (*) in Table 1 were conducted after the MultiCardioNER evaluation
period and are not included in the official leaderboard. However, these supplementary experiments
provide further insights beyond the primary evaluation results. For instance, the fine-tuning process
considerably enhances performance across all developed systems. Additionally, employing a multilingual
model proves beneficial in certain subtasks, such as Spanish Medications Recognition (SMR) and
English Medications Recognition (EMR), resulting in an improved F1-score from 91.65% (achieved by
the subsequent best performing model) to 92.09%, and from 91.46% to 91.74%, respectively.</p>
      <p>By comparing the results from the development and test sets, we can assess potential discrepancies
between these data splits and identify issues such as overfitting or distributional disparities. These
insights are crucial for enhancing model robustness and generalization, which are essential for
successfully utilizing the developed systems in real-world clinical scenarios. As illustrated in Table 1,
non-fine-tuned models exhibit similar evaluation metrics on both the development and test sets. For
these models, the development set was solely used to select the best-performing model across different
checkpoints. This consistency suggests that the two data splits originate from the same distribution. In
contrast, fine-tuned models – trained on the development set – demonstrate a performance gap between
the two sets. While some degree of performance difference is expected due to the model’s exposure to
the development data during training, excessively large gaps suggest overfitting. This is the case for the
Spanish Diseases Recognition (SDR) models, where the performance gap between the development
and test sets is 18.35% for Cardio-SDR and 16.3% for MultiCardio-SDR. For all other fine-tuned models,
the F1-score on the development set is only slightly higher than that computed on the test set, with
differences ranging from 1.95% to 7.44%. Although these differences may indicate some overfitting, they
do not reach a severe extent. One plausible explanation for overfitting in these cases could be that the
model is too complex for the limited diversity of cardiology-specific entities present in the development
set. As a result, the model may capture specific patterns from the training data but struggle to generalize
to new data.</p>
      <p>In addition to this performance analysis, we conducted a qualitative evaluation of the top-performing
models across all subtasks. The qualitative analysis complements the quantitative metrics, providing a
comprehensive assessment of the capabilities of the developed models in real-world clinical scenarios.
The outcomes, as illustrated in Figure 2, Figure 3, Figure 4, and Figure 5, indicate that the models
perform commendably in identifying medications within clinical texts across all three targeted
languages. However, the Spanish Diseases Recognition (SDR) model exhibits room for improvement, as it
occasionally produces incomplete or incorrect predictions.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>In this paper, we investigated the utilization of BERT-based contextual embeddings, trained on general
domain texts, for extracting mentions of diseases and medications from clinical case reports written in
English, Spanish, and Italian. We developed four distinct monolingual models: (1) Spanish Diseases
Recognition (SDR), (2) Spanish Medications Recognition (SMR), (3) English Medications Recognition
(EMR), and (4) Italian Medications Recognition (IMR). Additionally, we created two multilingual models:
one specialized for Spanish Diseases Recognition (Multi-SDR) and another for Medications Recognition
across all three targeted languages (Multi-MMR). While the results show promising performance in
identifying medications within clinical texts across all three languages, the models are not flawless.
Some weaknesses arise in diseases recognition, where they occasionally produce incomplete or incorrect
predictions. To address these issues, we aim to explore the capabilities of recent large language models
(LLMs).</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work received funding from the European Union’s Horizon Europe research and innovation
programme under Grant Agreement No. 101057849 (DataTools4Heart project).
[34] A. Miranda-Escalada, E. Farré, L. Gasco, S. Lima, M. Krallinger, DisTEMIST corpus: detection and
normalization of disease mentions in spanish clinical cases, 2023. URL: https://doi.org/10.5281/
zenodo.7614764. doi:10.5281/zenodo.7614764.
[35] A. Intxaurrondo, M. Krallinger, Spaccc, 2019. URL: https://doi.org/10.5281/zenodo.2560316. doi:10.</p>
      <p>5281/zenodo.2560316.
[36] E. Farré-Maduell, L. Gascó, S. Lima, A. Miranda-Escalada, M. Krallinger, DisTEMIST Guidelines:
detection and normalization of disease mentions in spanish clinical cases, 2022. URL: https://doi.
org/10.5281/zenodo.6477407. doi:10.5281/zenodo.6477407.
[37] S. Lima-López, E. Farré-Maduell, M. Krallinger, DrugTEMIST Guidelines: Annotation of
Medication in Medical Documents, 2024. URL: https://doi.org/10.5281/zenodo.11065433. doi:10.5281/
zenodo.11065433.
[38] P. Stenetorp, S. Pyysalo, G. Topić, T. Ohta, S. Ananiadou, J. Tsujii, Brat: a web-based tool for
nlp-assisted text annotation, in: Proceedings of the Demonstrations at the 13th Conference of the
European Chapter of the Association for Computational Linguistics, 2012, pp. 102–107.
[39] M. Romero, bert-spanish-cased-finetuned-ner, 2020. URL: https://huggingface.co/mrm8488/
bert-spanish-cased-finetuned-ner, accessed: 2024-06-07.
[40] J. Cañete, G. Chaperon, R. Fuentes, J.-H. Ho, H. Kang, J. Pérez, Spanish pre-trained bert model and
evaluation data, in: PML4DC at ICLR 2020, 2020.
[41] E. F. Tjong Kim Sang, Introduction to the CoNLL-2002 shared task: Language-independent named
entity recognition, in: COLING-02: The 6th Conference on Natural Language Learning 2002
(CoNLL-2002), 2002. URL: https://aclanthology.org/W02-2024.
[42] J. Nivre, J. Hall, S. Kübler, R. McDonald, J. Nilsson, S. Riedel, D. Yuret, The conll 2007 shared task
on dependency parsing, in: Proceedings of the 2007 Joint Conference on Empirical Methods in
Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL),
2007, pp. 915–932.
[43] D. S. Lim, bert-base-ner, 2020. URL: https://huggingface.co/dslim/bert-base-NER, accessed:
202406-07.
[44] E. F. Tjong Kim Sang, F. De Meulder, Introduction to the CoNLL-2003 shared task:
Languageindependent named entity recognition, in: Proceedings of the Seventh Conference on Natural
Language Learning at HLT-NAACL 2003, 2003, pp. 142–147. URL: https://www.aclweb.org/anthology/
W03-0419.
[45] N. Procopio, bert-italian-finetuned-ner, 2023. URL: https://huggingface.co/nickprock/
bert-italian-finetuned-ner, accessed: 2024-06-07.
[46] X. Pan, B. Zhang, J. May, J. Nothman, K. Knight, H. Ji, Cross-lingual name tagging and linking for
282 languages, in: R. Barzilay, M.-Y. Kan (Eds.), Proceedings of the 55th Annual Meeting of the
Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational
Linguistics, Vancouver, Canada, 2017, pp. 1946–1958. URL: https://aclanthology.org/P17-1178.
doi:10.18653/v1/P17-1178.
[47] D. Adelani, bert-base-multilingual-cased-ner-hrl, 2021. URL:
https://huggingface.co/Davlan/bert-base-multilingual-cased-ner-hrl, accessed: 2024-06-10.
[48] L. A. Ramshaw, M. P. Marcus, Text chunking using transformation-based learning, in: Natural
language processing using very large corpora, Springer, 1999, pp. 157–176.
[49] A. Kosmopoulos, I. Partalas, E. Gaussier, G. Paliouras, I. Androutsopoulos, Evaluation measures
for hierarchical classification: a unified view and novel approaches, Data Mining and Knowledge
Discovery 29 (2015) 820–865.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E. T. Rubel</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Andrioli de Souza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Knafou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. E.</given-names>
            <surname>Oliveira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. B.</given-names>
            <surname>Gumiel</surname>
          </string-name>
          , L. F. de Oliveira,
          <string-name>
            <given-names>D.</given-names>
            <surname>Teodoro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. C.</given-names>
            <surname>Paraiso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Moro</surname>
          </string-name>
          , et al.,
          <article-title>Biobertpt: a portuguese neural language model for clinical named entity recognition</article-title>
          ,
          <source>in: Proceedings of the 3rd Clinical Natural Language Processing Workshop, 19 November</source>
          <year>2020</year>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Kundeti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vijayananda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mujjiga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kalyan</surname>
          </string-name>
          ,
          <article-title>Clinical named entity recognition: Challenges and opportunities</article-title>
          ,
          <source>in: 2016 IEEE International Conference on Big Data (Big Data)</source>
          , IEEE,
          <year>2016</year>
          , pp.
          <fpage>1937</fpage>
          -
          <lpage>1945</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Bahadori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Colak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhatia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Celikkaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bhakta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Senthivel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Khalilia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Navarro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , et al.,
          <article-title>Improving hospital mortality prediction with medical named entities and multimodal learning</article-title>
          ,
          <source>arXiv preprint arXiv:1811.12276</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Riccio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Romano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Korsun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cirillo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Postiglione</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>La Gatta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferraro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Galli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Moscato</surname>
          </string-name>
          ,
          <article-title>Healthcare data summarization via medical entity recognition and generative ai</article-title>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Zaikis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Vlahavas</surname>
          </string-name>
          ,
          <article-title>Drug-drug interaction classification using attention based neural networks</article-title>
          ,
          <source>in: 11th Hellenic conference on artificial intelligence</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>34</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Abulaish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Parwez</surname>
          </string-name>
          , et al.,
          <article-title>Disease: A biomedical text analytics system for disease symptom extraction and characterization</article-title>
          ,
          <source>Journal of Biomedical Informatics</source>
          <volume>100</volume>
          (
          <year>2019</year>
          )
          <fpage>103324</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Rink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Harabagiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Roberts</surname>
          </string-name>
          ,
          <article-title>Automatic extraction of relations between medical concepts in clinical texts</article-title>
          ,
          <source>Journal of the American Medical Informatics Association</source>
          <volume>18</volume>
          (
          <year>2011</year>
          )
          <fpage>594</fpage>
          -
          <lpage>600</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yoon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>So</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <article-title>Biobert: a pre-trained biomedical language representation model for biomedical text mining</article-title>
          ,
          <source>Bioinformatics</source>
          <volume>36</volume>
          (
          <year>2020</year>
          )
          <fpage>1234</fpage>
          -
          <lpage>1240</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Kanakarajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kundumani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sankarasubbu</surname>
          </string-name>
          ,
          <article-title>Bioelectra: pretrained biomedical text encoder using discriminators</article-title>
          ,
          <source>in: Proceedings of the 20th workshop on biomedical language processing</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>143</fpage>
          -
          <lpage>154</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>E.</given-names>
            <surname>Alsentzer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Murphy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Boag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-H.</given-names>
            <surname>Weng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Naumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>McDermott</surname>
          </string-name>
          ,
          <article-title>Publicly available clinical bert embeddings</article-title>
          ,
          <source>arXiv preprint arXiv:1904.03323</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jin</surname>
          </string-name>
          , W. Liu,
          <string-name>
            <given-names>B. P. S.</given-names>
            <surname>Rawat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yu</surname>
          </string-name>
          , et al.,
          <article-title>Fine-tuning bidirectional encoder representations from transformers (bert)-based models on large-scale electronic health record notes: an empirical study</article-title>
          ,
          <source>JMIR medical informatics</source>
          <volume>7</volume>
          (
          <year>2019</year>
          )
          <elocation-id>e14830</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lima-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Farré-Maduell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rodríguez-Miret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rodríguez-Ortega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lilli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lenkowicz</surname>
          </string-name>
          , G. Ceroni,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kossof</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nentidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krithara</surname>
          </string-name>
          , G. Katsimpras, G. Paliouras,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          ,
          <article-title>Overview of MultiCardioNER task at BioASQ 2024 on Medical Speciality and Language Adaptation of Clinical NER Systems for Spanish, English and Italian</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>García Seco de Herrera</surname>
          </string-name>
          (Eds.),
          <source>Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nentidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krithara</surname>
          </string-name>
          , G. Paliouras,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Sanchez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Farre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Loukachevitch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Davydova</surname>
          </string-name>
          , E. Tutubalina,
          <article-title>Bioasq at clef2024: The twelfth edition of the large-scale biomedical semantic indexing and question answering challenge</article-title>
          ,
          <source>in: European Conference on Information Retrieval</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>490</fpage>
          -
          <lpage>497</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nentidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Katsimpras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krithara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lima-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Farré-Maduell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Loukachevitch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Davydova</surname>
          </string-name>
          , E. Tutubalina, G. Paliouras,
          <article-title>Overview of BioASQ 2024: The twelfth BioASQ challenge on Large-Scale Biomedical Semantic Indexing and Question Answering</article-title>
          , in:
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mulhem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Quénot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schwab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Soulier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Maria Di Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>García Seco de Herrera</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024)</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>Bert: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T.</given-names>
            <surname>Pires</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Schlinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Garrette</surname>
          </string-name>
          ,
          <article-title>How multilingual is multilingual BERT?</article-title>
          , in:
          <string-name>
            <given-names>A.</given-names>
            <surname>Korhonen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Traum</surname>
          </string-name>
          , L. Màrquez (Eds.),
          <source>Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics
          , Florence, Italy,
          <year>2019</year>
          , pp.
          <fpage>4996</fpage>
          -
          <lpage>5001</lpage>
          . URL: https://aclanthology.org/P19-1493. doi:10.18653/v1/P19-1493.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dredze</surname>
          </string-name>
          ,
          <article-title>Beto, bentz, becas: The surprising cross-lingual effectiveness of bert</article-title>
          ,
          <source>arXiv preprint arXiv:1904.09077</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Improving biomedical named entity recognition with syntactic information</article-title>
          ,
          <source>BMC bioinformatics</source>
          <volume>21</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>I.</given-names>
            <surname>Pérez-Díez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pérez-Moraga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>López-Cerdán</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Salinas-Serrano</surname>
          </string-name>
          , M. d. la Iglesia-Vayá,
          <article-title>Deidentifying spanish medical texts - named entity recognition applied to radiology reports</article-title>
          ,
          <source>Journal of Biomedical Semantics</source>
          <volume>12</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>M.</given-names>
            <surname>Marimon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gonzalez-Agirre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Intxaurrondo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Lopez Martin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Villegas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          ,
          <article-title>MEDDOCAN corpus: gold standard annotations for Medical Document Anonymization on Spanish clinical case reports</article-title>
          ,
          <year>2020</year>
          . URL: https://doi.org/10.5281/zenodo.4279323. doi:10.5281/zenodo.4279323.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>V.</given-names>
            <surname>Kocaman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Talby</surname>
          </string-name>
          ,
          <article-title>Biomedical named entity recognition at scale</article-title>
          ,
          <source>in: Pattern Recognition. ICPR International Workshops and Challenges: Virtual Event, January 10-15, 2021, Proceedings, Part I</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>635</fpage>
          -
          <lpage>646</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>C. P.</given-names>
            <surname>Carrino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Llop</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pàmies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gutiérrez-Fandiño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Armengol-Estapé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Silveira-Ocampo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Valencia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gonzalez-Agirre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Villegas</surname>
          </string-name>
          ,
          <article-title>Pretrained biomedical language models for clinical nlp in spanish</article-title>
          ,
          <source>in: Proceedings of the 21st Workshop on Biomedical Language Processing</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>193</fpage>
          -
          <lpage>199</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , H. Cheng,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Poon</surname>
          </string-name>
          ,
          <article-title>Optimizing bi-encoder for named entity recognition via contrastive learning</article-title>
          ,
          <source>arXiv preprint arXiv:2208.14565</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>C.</given-names>
            <surname>Walker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. D.</given-names>
            <surname>Consortium</surname>
          </string-name>
          ,
          <article-title>ACE 2005 Multilingual Training Corpus, LDC corpora</article-title>
          ,
          <source>Linguistic Data Consortium</source>
          ,
          <year>2005</year>
          . URL: https://books.google.at/books?id=SbjjuQEACAAJ.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ohta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tateisi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-D.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>The GENIA corpus: an annotated research abstract corpus in molecular biology domain</article-title>
          ,
          <source>in: Proceedings of the Second International Conference on Human Language Technology Research</source>
          , HLT '02, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA,
          <year>2002</year>
          , pp.
          <fpage>82</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>A.</given-names>
            <surname>Khandelwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. R.</given-names>
            <surname>Chikka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Karlapalem</surname>
          </string-name>
          ,
          <article-title>Biomedical NER using novel schema and distant supervision</article-title>
          ,
          <source>in: Proceedings of the 21st Workshop on Biomedical Language Processing</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>155</fpage>
          -
          <lpage>160</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ameer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zuo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>Zero-shot clinical entity recognition using ChatGPT</article-title>
          ,
          <source>arXiv preprint arXiv:2303.16416</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>OpenAI</surname>
          </string-name>
          , ChatGPT,
          <year>2022</year>
          . URL: https://chat.openai.com, accessed: 2024-06-10.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>T. B.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ryder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Subbiah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhariwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Neelakantan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shyam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sastry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Askell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Herbert-Voss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Krueger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Henighan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Child</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Ziegler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Winter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hesse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sigler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Litwin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chess</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Berner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>McCandlish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Amodei</surname>
          </string-name>
          ,
          <article-title>Language models are few-shot learners</article-title>
          ,
          <year>2020</year>
          . arXiv:2005.14165.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tripathy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Choudhary</surname>
          </string-name>
          ,
          <article-title>Improving biomedical named entity recognition through transfer learning and asymmetric tri-training</article-title>
          ,
          <source>Procedia Computer Science</source>
          <volume>218</volume>
          (
          <year>2023</year>
          )
          <fpage>2723</fpage>
          -
          <lpage>2733</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>A.</given-names>
            <surname>Miranda-Escalada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gascó</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lima-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Farré-Maduell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Estrada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nentidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krithara</surname>
          </string-name>
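          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Katsimpras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Paliouras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          ,
          <article-title>Overview of DisTEMIST at BioASQ: Automatic detection and normalization of diseases from clinical texts: results, methods, evaluation and multilingual resources</article-title>
          ,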
          <source>in: CLEF (Working Notes)</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>179</fpage>
          -
          <lpage>203</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lima-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Farré-Maduell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gascó</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nentidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krithara</surname>
          </string-name>
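          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Katsimpras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Paliouras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          ,
          <article-title>Overview of MedProcNER task on medical procedure detection and entity linking at BioASQ 2023</article-title>
          ,
          <source>in: CLEF (Working Notes)</source>
          ,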
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lima-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Farré-Maduell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rodríguez-Miret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          ,
          <article-title>MultiCardioNER Corpus: Multilingual Adaptation of Clinical NER Systems to the Cardiology Domain</article-title>
          ,
          <year>2024</year>
          . URL: https://doi.org/10.5281/zenodo.11368861. doi:10.5281/zenodo.11368861.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>