<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Conference and Labs of the Evaluation Forum, September</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Identifying Cardiological Disorders in Spanish via Data Augmentation and Fine-Tuned Language Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Antonio Romano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giuseppe Riccio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Postiglione</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vincenzo Moscato</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Naples Federico II, Department of Electrical Engineering and Information Technology (DIETI)</institution>
          ,
          <addr-line>Via Claudio, 21 - 80125 - Naples</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>0</volume>
      <fpage>9</fpage>
      <lpage>12</lpage>
      <abstract>
        <p>This study presents a novel approach to Biomedical Named Entity Recognition (BioNER), specifically tailored for the cardiology domain. The challenge of adapting models to specific fields is addressed through the integration of cross-domain transfer learning and data augmentation techniques. The process begins with the fine-tuning of a compact Biomedical Transformer model on the DisTEMIST corpus, enabling the capture of general biomedical concepts. This model is then further trained on the CardioCCC corpus, a cardiology-specific dataset, enhancing its ability to identify and interpret cardiological entities. A data augmentation strategy is then employed, leveraging Context Similarity and K-Nearest Neighbors (KNN) to generate augmented datasets, which further improves the model's ability to recognize medical entities. The final step involves a NER Fusion strategy, which combines outputs from multiple BioNER taggers to bolster robustness and accuracy in entity recognition. Experimental results from the MultiCardioNER challenge demonstrate the effectiveness of the proposed approach. Our framework surpasses the median F1 Score of 0.7566 by approximately 4%, achieving a score of 0.791, which is only 2% lower than the top submission, despite being based on much smaller language models.</p>
      </abstract>
      <kwd-group>
        <kwd>Biomedical Named Entity Recognition</kwd>
        <kwd>Data Augmentation</kwd>
        <kwd>Language Models</kwd>
        <kwd>EHRs</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In recent years, given the increasing volume of clinical data generated by medical personnel and the
evolution of Artificial Intelligence (AI) models, it has become necessary to adopt techniques for the
automatic extraction of medical concepts in order to support the development of personalized insights
useful for patient health.</p>
      <p>
        Specializing pre-trained BioNER models from general medical domains to specific fields like cardiology
presents significant challenges due to limited specialized data availability, as highlighted by Nguyen
et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and Chen et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Transfer learning is a pivotal method for enhancing model performance in
specific domains, as shown by Sasikumar and Mantri [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and Zhou et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], who adapted pre-trained
biomedical models to specialized areas. Nevertheless, this approach is often insufficient due to the
complexity of domain-specific texts.
      </p>
      <p>
        Our approach attempts to address this problem by generating new data that increases the presence
of less frequent medical entities by replacing them with similar medical entities. Therefore, for the
identification of novel medical entities within the specified domain, it is necessary to establish a
substitution strategy that, in contrast to other methodologies (Phan and Nguyen [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]; Ghosh et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]),
exploits the contextual similarity of the sentence in which the entity is to be augmented.
      </p>
      <p>
        The proposed annotation methodology includes, in addition to data augmentation, a late fusion
mechanism that leverages the use of various pre-trained models in the medical domain, similar to the
work proposed by Sun and Bhatia [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and fine-tuned with cardiology data. This mechanism aims
to improve the robustness and coverage of the generated annotations, as these models, trained on
heterogeneous data, allow our system to recognize a greater number of medical entities through their
combination.
      </p>
      <p>
        We evaluated our approach in the first track of the MultiCardioNER [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] challenge, part of
the BioASQ 2024 [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] workshop. Specifically, we employed a diverse range of pre-trained models, each
fine-tuned on combinations of the DisTEMIST dataset [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] as well as a new dataset (CardioCCC) of
cardiology clinical cases annotated using the same guidelines. Our method surpassed the median F1
Score of 0.7566 by approximately 4% to achieve a score of 0.791. Interestingly, our score is close to the
winning submission (only 2% lower) despite being based on much smaller transformer architectures.
      </p>
      <p>The remainder of this paper is structured as follows: Section 2 discusses the existing literature related
to our work; in Section 3 we outline the scope and objectives of our study; Section 4 presents the
datasets used in our framework and their main characteristics. In Section 5, we detail our method, while
experiments are presented in Section 6, discussing the results and their implications within the context
of our research objectives. Finally, Section 7 summarizes the contributions of this paper and suggests
avenues for future research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>Adapting BioNER to specific medical domains, such as cardiology, presents significant challenges due
to the complexity and variety of medical language.</p>
      <p>
        Transfer learning has proven to be an essential method in the field of cross-domain BioNER for
enhancing model resilience with respect to the medical concepts of specific domains. For example,
Sasikumar and Mantri [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] leverage pre-trained models on biomedical corpora to adapt them to specific
medical domains. Similarly, Zhou et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] utilized transfer learning to leverage pre-trained features of
general medicine models to improve the accuracy of specialized NER systems in clinical records.
      </p>
      <p>However, despite the effectiveness of transfer learning, significant challenges remain. One of these
is the adaptation of general clinical concept recognition systems to cardiology, a domain with unique
complexity and specificity. Transfer learning alone may not be sufficient to address the challenges
associated with domain-specific NER, due to the diversity and complexity of biomedical texts. To bridge
this gap, our study proposes an approach that integrates transfer learning with Data Augmentation.</p>
      <p>
        Extensive research has therefore been conducted on strategies for augmenting text data in order to
address the lack of manually annotated data in specific medical domains (e.g., cardiology).
For example, Bartolini et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] propose COSINER, which generates distinct augmented data by
using the contextual replacement of entities. Furthermore, Ghosh et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] presented BioAug, which
conditionally generates augmented data using the BART model to guarantee factual accuracy and
diversity. Another approach for entity replacement, proposed by Phan and Nguyen [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], creates new
sentences by substituting entities with semantically equivalent ones using Gazetteer terms.
      </p>
      <p>
        Following the cross-domain phase, to improve the robustness and coverage of the BioNER models, we
utilize the merging of BioNER taggers. Sun and Bhatia [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] proposed merging the results to manage tag
overlap and improve the complete concept extraction. Our approach takes inspiration from the
above-mentioned method, merging the results generated by different BioNER taggers to increase coverage
and relying on these research results to ensure a more complete and accurate extraction of entities. In
addition, merging taggers involves handling overlapping tags and conflicting results, ensuring that the
final output is more precise and coherent.
      </p>
      <sec id="sec-2-1">
        <title>Our contribution</title>
        <p>
          In the proposed framework, we have adopted four pre-trained models. In particular, the basic models are
those of Carrino et al. [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] and Carrino et al. [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], which have demonstrated excellent results in BioNER
in medical texts in Spanish. The other two models have been trained from the previous ones using wider
medical datasets, such as those from Cohen et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] and Llanos et al. [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], thus enriching knowledge
of basic models and improving recognition of a greater number of medical entities. Subsequently, we
implemented a phase of data augmentation on the CardioCCC cardiology-specific dataset in order to
increase the number of medical concepts useful for cross-domain transfer learning, inspired by the
entity replacement technique proposed by Phan and Nguyen [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Finally, our approach involves merging
the predictions made by the different BioNER taggers, following a strategy for managing overlaps
between the various annotations, inspired by the merging technique described by Sun and Bhatia [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Problem formulation</title>
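      <p>
        Starting from a dataset of annotated sentences denoted as <inline-formula><tex-math>\mathcal{D} = \{(x_i, y_i) \in X \times Y\}</tex-math></inline-formula>, where:
        <list list-type="bullet">
          <list-item><p>X is the collection of all sentences.</p></list-item>
          <list-item><p><inline-formula><tex-math>i \in \{1, \dots, N\}</tex-math></inline-formula>, with N representing the total number of sentences in the dataset and <inline-formula><tex-math>x_i</tex-math></inline-formula> representing the i-th sentence.</p></list-item>
          <list-item><p>Each sentence <inline-formula><tex-math>x_i</tex-math></inline-formula> is a sequence of tokens <inline-formula><tex-math>t_k \in x_i</tex-math></inline-formula>, where <inline-formula><tex-math>k \in \{1, \dots, L_i\}</tex-math></inline-formula> and <inline-formula><tex-math>L_i</tex-math></inline-formula> is the length of the sentence.</p></list-item>
          <list-item><p>Y is the set of possible labels. We use the IOB2 annotation scheme [<xref ref-type="bibr" rid="ref16">16</xref>], thus Y = {B, I, O}, where B marks the beginning of an entity, I marks the inside of an entity, and O marks tokens outside any entity.</p></list-item>
          <list-item><p><inline-formula><tex-math>y_i</tex-math></inline-formula> assigns each token <inline-formula><tex-math>t_k \in x_i</tex-math></inline-formula> to its corresponding label <inline-formula><tex-math>y_k</tex-math></inline-formula>.</p></list-item>
        </list>
      </p>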
      <p>The objective of the BioNER model is to precisely assign the appropriate tag from Y to each token
within a given input sentence.</p>
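      <p>To make the labelling scheme concrete, the following minimal sketch decodes a sequence of IOB2 labels back into entity spans. The function name, the example sentence and its labels are illustrative assumptions and are not taken from the challenge data.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Tuple


def iob2_to_spans(labels: List[str]) -> List[Tuple[int, int]]:
    """Decode IOB2 labels into (start, end) token spans, end exclusive."""
    spans = []
    start = None  # index where the currently open entity began
    for i, label in enumerate(labels):
        if label == "B":
            # A new entity starts; close the previous one if it is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif label == "I":
            # Strict IOB2 never opens an entity with I; tolerate it here.
            if start is None:
                start = i
        else:
            # Label O closes any open entity.
            if start is not None:
                spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(labels)))
    return spans


# Hypothetical example sentence with one label per token.
tokens = ["Paciente", "con", "infarto", "agudo", "de", "miocardio", "y", "angina", "."]
labels = ["O", "O", "B", "I", "I", "I", "O", "B", "O"]
entities = [" ".join(tokens[s:e]) for s, e in iob2_to_spans(labels)]
print(entities)
```
]]></preformat>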
    </sec>
    <sec id="sec-4">
      <title>4. Materials</title>
      <p>
        In our study, we utilized data provided by the MultiCardioNER challenge, which encompasses various
clinical domain corpora, including DisTEMIST and the smaller CardioCCC corpus. Specifically, the
DisTEMIST corpus underwent manual annotation by clinical experts, following specific guidelines for
annotating diseases in Spanish clinical cases, as outlined in the work of Miranda-Escalada et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
These guidelines were meticulously developed by clinical experts through multiple cycles of quality
control and consistency analysis before the entire dataset received annotations.
      </p>
      <p>The training set for recognizing designated entities within the DisTEMIST corpus comprises 1000
recorded clinical cases. A similar procedure was applied to 508 documents related to
cardiological clinical cases within the CardioCCC corpus.</p>
      <p>To enhance the annotated dataset, we leveraged the DisTEMIST gazetteer, which contains key terms
and synonyms for clinical entities. This tool significantly improves coverage of terminological and
semantic variations in cardiological clinical texts using similarity-based approaches, thereby enhancing
the quality and accuracy of annotations.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Methodology</title>
      <p>Figure 1 shows an overview of the methodological flow of our solution for the MultiCardioNER track.
The DisTEMIST corpus training set, annotated according to the guidelines described
previously, is preprocessed to build a new dataset containing, for each document, the document identifier,
its tokens, and the tag associated with each token, using the properly mapped
BIO scheme. Clinical text sentences are split into single tokens because each model used accepts only a
limited number of input tokens. The same process is applied to prepare
CardioCCC for the next fine-tuning phase.</p>
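      <p>The preprocessing step can be sketched as follows, assuming that annotations arrive as character offsets and that a simple regular-expression tokenizer stands in for the model tokenizer. The record layout, the function names and the window size are hypothetical and only illustrate the mapping to the BIO scheme.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re
from typing import Dict, List, Tuple


def to_bio_record(doc_id: str, text: str, char_spans: List[Tuple[int, int]]) -> Dict:
    """Tokenize text and tag each token with B, I or O.

    char_spans holds (start, end) character offsets of entities, end exclusive.
    """
    tokens, tags = [], []
    # Words and single punctuation marks become separate tokens.
    for match in re.finditer(r"\w+|[^\w\s]", text):
        tok_start, tok_end = match.span()
        tag = "O"
        for ent_start, ent_end in char_spans:
            # A token belongs to an entity when it lies fully inside its span.
            if ent_start <= tok_start and tok_end <= ent_end:
                tag = "B" if tok_start == ent_start else "I"
                break
        tokens.append(match.group())
        tags.append(tag)
    return {"id": doc_id, "tokens": tokens, "tags": tags}


def chunk(items: List, max_len: int) -> List[List]:
    """Split a sequence into windows of at most max_len items."""
    return [items[i:i + max_len] for i in range(0, len(items), max_len)]


# Hypothetical document with one annotated disease mention.
text = "Paciente con infarto agudo de miocardio."
start = text.index("infarto")
end = start + len("infarto agudo de miocardio")
record = to_bio_record("doc-1", text, [(start, end)])
print(record["tokens"])
print(record["tags"])
```
]]></preformat>
      <p>Note that naive windowing as in <monospace>chunk</monospace> may cut an entity in two; a real pipeline would likely split on sentence boundaries instead.</p>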
      <p>[Figure 1: Overview of the methodological flow of our solution for the MultiCardioNER track. Recoverable panel labels: DisTEMIST Corpus, Embedding, Gazetteer of DisTEMIST, Data Augmentation.]</p>
      <sec id="sec-5-1">
        <title>5.1. Cross-domain transfer learning</title>
        <p>We propose an innovative cross-domain transfer learning solution to enhance disease recognition in
cardiology. Our approach leverages a Biomedical Transformer Backbone, which is fine-tuned on various
corpora provided by the challenge, to achieve superior predictive performance.</p>
        <p>Initial Fine-Tuning on DisTEMIST. We start by fine-tuning the Biomedical Transformer Backbone
on the DisTEMIST corpus. This initial step adapts the model to the general biomedical
language and disease entities present in this dataset. The corpus provides a broad foundation, enabling
the model to capture essential biomedical concepts and terminology.</p>
        <p>Transfer learning on CardioCCC. The fine-tuned model is then trained on the CardioCCC
corpus to generate the first set of predictions. This step allows the model to adapt its understanding
specifically to cardiological contexts and terminology. Focusing on this specialized corpus ensures
that the model can recognize and interpret data relevant to cardiology. In fact, it generates a second set
of predictions. This ensures that the model integrates the cardiology-specific knowledge more deeply.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Data Augmentation with Context Similarity</title>
        <p>Frequency Study. Prior to implementing data augmentation, a systematic Frequency Study was
performed to identify underrepresented entities and contexts within the CardioCCC dataset. This
analysis examined the frequency of each medical mention in the dataset.
Following this analysis, a threshold was determined, corresponding approximately to the knee of the
curve (Figure 2) that illustrates the distribution of word frequencies within the dataset. This threshold
represents a balance point: words with frequencies above this threshold are sufficiently common and
do not require augmentation, while those with frequencies below it are infrequent and can benefit
from augmentation. This approach ensures that the augmentation process specifically targets these
deficiencies, thereby optimizing the benefits of the additional data.</p>
        <p>Through this analysis, we identified the entities that should be replaced to enhance the diversity and
comprehensiveness of the CardioCCC dataset. This enhancement improves the model’s
generalization capabilities, enabling it to recognize a broader spectrum of disease entities and ultimately achieve
higher accuracy.</p>
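        <p>One way to locate such a threshold automatically is sketched below. It assumes a common heuristic that picks the point of the sorted frequency curve lying farthest from the straight line joining its endpoints; the paper does not state how the knee was found, and the mention counts are invented for illustration.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def knee_threshold(frequencies) -> int:
    """Return the frequency at the knee of the descending frequency curve.

    Heuristic (an assumption, not the authors' stated method): choose the
    point with maximum distance from the chord joining the curve endpoints.
    """
    ys = sorted(frequencies, reverse=True)
    n = len(ys)
    if n < 3:
        return ys[-1] if ys else 0
    x0, y0, x1, y1 = 0, ys[0], n - 1, ys[-1]
    best_index, best_distance = 0, -1.0
    for i, y in enumerate(ys):
        # Unnormalised distance from point (i, y) to the chord.
        distance = abs((y1 - y0) * i - (x1 - x0) * y + x1 * y0 - y1 * x0)
        if distance > best_distance:
            best_index, best_distance = i, distance
    return ys[best_index]


# Hypothetical mention counts; real counts would come from the corpus.
counts = Counter({
    "hipertension": 50, "angina": 30, "miocarditis": 5, "pericarditis": 3,
    "endocarditis": 2, "taquicardia": 1, "bradicardia": 1, "sincope": 1,
})
threshold = knee_threshold(counts.values())
# Mentions strictly below the threshold are candidates for augmentation.
rare = sorted(m for m, c in counts.items() if c < threshold)
print(threshold, rare)
```
]]></preformat>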
        <p>Entity Replacement using Context Similarity with KNN. With the insights gained from the
Frequency Study, we leverage the abundant information provided by the Gazetteer to fill gaps in
the CardioCCC dataset, thereby enhancing its overall quality and utility for training the Biomedical
Transformer Backbone fine-tuned on DisTEMIST. To this end, we employ
a dataset augmentation technique based on Context Similarity with K-Nearest Neighbors (KNN),
with K = 1, as illustrated in Figure 3.</p>
        <p>This approach involves calculating the similarity between the embeddings of sentences in the
CardioCCC dataset and the embeddings of entities in the Gazetteer. By targeting
sentences in CardioCCC annotated with B and I tags from the BIO scheme, we identify the most
contextually similar entities in the Gazetteer and replace the original entities in the CardioCCC
sentences X with these similar entities, obtaining <inline-formula><tex-math>\hat{X}</tex-math></inline-formula>.</p>
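        <p>To formalize the Data Augmentation phase, we use a Gazetteer denoted as <inline-formula><tex-math>G = \{(e_j, \hat{y}_j) \in E \times \hat{Y}\}</tex-math></inline-formula>, where E represents the collection of all entities in the Gazetteer, <inline-formula><tex-math>e_j</tex-math></inline-formula> is the j-th entity, with <inline-formula><tex-math>j \in \{1, \dots, M\}</tex-math></inline-formula>, and <inline-formula><tex-math>\hat{Y} = \{B, I\}</tex-math></inline-formula> represents the set of labels assigned to <inline-formula><tex-math>e_j</tex-math></inline-formula>. Here, M denotes the total
number of entities in the Gazetteer.</p>
        <p>Subsequently, we employ Context Similarity (CS), computed using the K-Nearest Neighbors (KNN)
function between the embeddings <inline-formula><tex-math>\phi(x_i) = \mathbf{x}_i</tex-math></inline-formula> and <inline-formula><tex-math>\phi(e_j) = \mathbf{e}_j</tex-math></inline-formula>. The Context Similarity (CS) is
defined as:
          <disp-formula><label>(1)</label><tex-math><![CDATA[CS : E_K = \mathrm{KNN}(\mathbf{x}_i, \mathbf{e}_j)]]></tex-math></disp-formula>
where <inline-formula><tex-math>E_K</tex-math></inline-formula> represents the top-similar entities from G that are candidates for augmenting sentences. The
augmented sentences <inline-formula><tex-math>\hat{X}</tex-math></inline-formula> are formulated as:
          <disp-formula><label>(2)</label><tex-math><![CDATA[\hat{X} = \{\hat{x}_i = CS(x_i) \mid \forall x_i \in X\}]]></tex-math></disp-formula>
        </p>
        <p>Consequently, the augmented dataset <inline-formula><tex-math>\hat{\mathcal{D}}</tex-math></inline-formula> is expressed as:
          <disp-formula><label>(3)</label><tex-math><![CDATA[\hat{\mathcal{D}} = \{(\hat{x}_i, y_i) \cup (x_i, y_i) \mid \forall x_i \in X, \forall \hat{x}_i \in \hat{X}, \forall y_i \in Y\}]]></tex-math></disp-formula>
where X and Y denote the original sets of sentences and their corresponding labels, respectively.</p>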
        <p>This method augments the dataset by merging both the original sentences (X) and contextually
similar sentences (X^ ), as can be seen from the flow of the data augmentation process shown in Figure 4.</p>
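        <p>The replacement step can be illustrated with a self-contained sketch. It assumes a toy bag-of-words embedding with cosine similarity as a stand-in for the Transformer sentence and entity embeddings, and a tiny hypothetical gazetteer; only the overall flow (embed the sentence, retrieve the K = 1 nearest gazetteer entity, replace the mention, keep both sentences) follows the description above.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding (stand-in for a Transformer encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest_entities(sentence: str, gazetteer: List[str], k: int = 1) -> List[str]:
    """Return the k gazetteer entities most similar to the sentence context."""
    query = embed(sentence)
    ranked = sorted(gazetteer, key=lambda e: cosine(query, embed(e)), reverse=True)
    return ranked[:k]


def augment(tokens: List[str], tags: List[str], gazetteer: List[str], k: int = 1) -> Tuple[List[str], List[str]]:
    """Replace every B/I-tagged mention with its nearest gazetteer entity."""
    sentence = " ".join(tokens)
    new_tokens, new_tags = [], []
    i = 0
    while i < len(tokens):
        if tags[i] != "B":
            new_tokens.append(tokens[i])
            new_tags.append(tags[i])
            i += 1
            continue
        j = i + 1
        while j < len(tokens) and tags[j] == "I":
            j += 1
        mention = " ".join(tokens[i:j])
        # Never replace a mention with itself.
        candidates = [e for e in gazetteer if e.lower() != mention.lower()]
        nearest = nearest_entities(sentence, candidates, k)
        replacement = nearest[0].split() if nearest else tokens[i:j]
        new_tokens.extend(replacement)
        new_tags.extend(["B"] + ["I"] * (len(replacement) - 1))
        i = j
    return new_tokens, new_tags


# Hypothetical sentence and gazetteer entries.
tokens = ["Paciente", "con", "dolor", "en", "el", "pecho", "por", "angina"]
tags = ["O"] * 7 + ["B"]
gazetteer = ["fibrilacion auricular", "angina de pecho", "insuficiencia cardiaca"]
aug_tokens, aug_tags = augment(tokens, tags, gazetteer, k=1)
# The augmented dataset keeps both the original and the augmented sentence.
dataset = [(tokens, tags), (aug_tokens, aug_tags)]
print(aug_tokens, aug_tags)
```
]]></preformat>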
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Transfer learning on CardioCCC Augmented Corpus</title>
        <p>The Biomedical Transformer Backbone, fine-tuned on DisTEMIST, is further trained on the
CardioCCC Augmented dataset. This final training phase allows the model to generate a third set of
predictions, benefiting from the increased diversity and richness of the data.</p>
        <p>Through this strategy, we enhance the model’s ability to generalize and recognize diseases
more accurately. By initially fine-tuning on the DisTEMIST corpus, the model gains a broad
understanding of biomedical language, which is essential for accurate disease recognition. Training on
the enriched CardioCCC corpus further refines the model to focus on cardiological data, ensuring
its predictions are contextually relevant and precise.</p>
      </sec>
      <sec id="sec-5-4">
        <title>5.4. BioNER Fusion</title>
        <p>To enhance the coverage of entities extracted from clinical notes, we merge the annotations generated
by different Biomedical Transformer Backbones. During the BioNER Fusion
phase, however, it is essential to define merging strategies to handle any overlapping annotations. To this end,
we establish a priority level based on the predictive performance of the models, which allows us to
select the annotation to keep in case of conflicts. Initially, we perform a fusion operation to remove
duplicate extracted entities that are entirely overlapping.</p>
        <p>[Figure: example of entity replacement using Context Similarity. Original sentence from CardioCCC: "The patient showed symptoms of angina and was advised to undergo an ECG". Relevant entities from the Gazetteer: angina: {"chest pain", "coronary artery disease", "myocardial infarction"}; ECG: {"electrocardiogram", "EKG", "heart monitor"}. For each mention, the entity with the highest contextual similarity is selected.]</p>
        <sec id="sec-5-4-2">
          <title>Data Augmented</title>
          <p>[Figure, continued: augmented sentence: "The patient showed symptoms of chest pain and was advised to undergo an electrocardiogram."]</p>
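          <p>To manage the fusion of the NER taggers generated by cross-domain transfer learning, we detect
overlaps with the following function:
            <disp-formula><label>(4)</label><tex-math><![CDATA[O : \min(ES_i, ES'_i) - \max(SS_i, SS'_i) \ge 0]]></tex-math></disp-formula>
where <inline-formula><tex-math>ES_i</tex-math></inline-formula> and <inline-formula><tex-math>SS_i</tex-math></inline-formula> represent the End Span and Start Span of the i-th entity extracted by the first model, while <inline-formula><tex-math>ES'_i</tex-math></inline-formula>
and <inline-formula><tex-math>SS'_i</tex-math></inline-formula> refer to the second model.</p>
          <p>Sometimes, the overlap is not complete, but the “start span” and “end span” of one entity partially
coincide with those of another entity (even if not identical), resulting in O &gt; 0.</p>
          <p>In essence, the BioNER Fusion strategy <inline-formula><tex-math>F_{\mathrm{BioNER}}</tex-math></inline-formula> is defined as:
            <disp-formula><label>(5)</label><tex-math><![CDATA[F_{\mathrm{BioNER}} = \begin{cases} e^{P} & \text{if } O \ge 0 \\ (e_i, e'_i) & \text{if } O < 0 \end{cases}]]></tex-math></disp-formula>
where <inline-formula><tex-math>e_i</tex-math></inline-formula> and <inline-formula><tex-math>e'_i</tex-math></inline-formula> are the entities extracted by the first and second model, and <inline-formula><tex-math>e^{P}</tex-math></inline-formula> is whichever of the two comes from the model with the higher priority level P.</p>
          <p>When two entities overlap, the conflict is resolved by prioritizing the entity with the higher priority level
P. The priority scheme assigned to the models is fixed according to the performance of the models
observed on the internal test set. For example, the model with priority level 1 has a higher priority than
the model with priority level 2. This approach ensures that all extracted entities are retained in the final
clinical note, thereby enhancing the overall entity extraction process.</p>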
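          <p>A minimal sketch of this fusion logic is given below. It is one possible reading of the strategy rather than the authors' implementation: spans are assumed to be inclusive character offsets, a value of the overlap function greater than or equal to zero is treated as a conflict, and entities are accepted greedily in priority order. The <monospace>Entity</monospace> type and the example spans are hypothetical.</p>
          <preformat preformat-type="code"><![CDATA[
```python
from typing import List, NamedTuple


class Entity(NamedTuple):
    start: int     # start span (character offset)
    end: int       # end span (character offset, assumed inclusive)
    priority: int  # priority level of the source model; 1 is the highest


def overlap(a: Entity, b: Entity) -> int:
    """Overlap function: min of the end spans minus max of the start spans."""
    return min(a.end, b.end) - max(a.start, b.start)


def fuse(predictions: List[List[Entity]]) -> List[Entity]:
    """Merge the outputs of several taggers.

    Entities are visited from the highest-priority model to the lowest; an
    entity is kept only if it overlaps no entity that was already kept, so
    conflicts (and exact duplicates) are won by the higher-priority model.
    """
    candidates = [e for model in predictions for e in model]
    candidates.sort(key=lambda e: (e.priority, e.start))
    kept: List[Entity] = []
    for ent in candidates:
        if all(overlap(ent, k) < 0 for k in kept):
            kept.append(ent)
    return sorted(kept, key=lambda e: e.start)


# Hypothetical predictions from two taggers on the same document.
model_1 = [Entity(10, 35, 1)]
model_2 = [Entity(10, 16, 2), Entity(50, 60, 2)]
fused = fuse([model_1, model_2])
print(fused)
```
]]></preformat>
          <p>In this example the second model's span (10, 16) conflicts with the first model's span (10, 35) and is dropped, while its non-overlapping span (50, 60) is retained.</p>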
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Experiments</title>
      <p>The performance of the proposed approaches for BioNER was assessed by participating in the
MultiCardioNER Shared Task as part of the BioASQ 2024 challenge. This section presents the results of
our methodology on the final test set, along with preliminary experiments conducted on the training
corpus provided by the challenge organizers.</p>
      <sec id="sec-6-1">
        <title>6.1. Experimental Setup</title>
        <p>6.1.1. Evaluation Metrics</p>
        <p>The evaluation is performed by comparing the automatically generated results with the manual
annotations produced by experts. The primary evaluation metrics for track 1 are micro-averaged
precision (MiP), recall (MiR), and F1 score (MiF1). The results were computed with the evaluation library
released by the organizers.</p>
        <p>6.1.2. Configuration</p>
        <p>The BioNER system was implemented using the HuggingFace Transformers library (v4.40.2),
exploiting the Spanish biomedical Transformer models available in its repository. Table 1
lists the models selected for the experiments.</p>
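        <p>For reference, the micro-averaged metrics can be sketched as follows, assuming strict matching of (document, start, end) triples pooled over all documents. This is only an illustration with invented spans; the reported scores were produced with the organizers' evaluation library, whose matching rules may differ.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Set, Tuple

Span = Tuple[str, int, int]  # (document id, start offset, end offset)


def micro_prf(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 with exact span matching."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold and predicted annotations over two documents.
gold = {("d1", 0, 5), ("d1", 10, 20), ("d2", 3, 9), ("d2", 15, 22)}
pred = {("d1", 0, 5), ("d1", 10, 18), ("d2", 3, 9)}
mip, mir, mif1 = micro_prf(gold, pred)
print(round(mip, 4), round(mir, 4), round(mif1, 4))
```
]]></preformat>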
        <p>These models were chosen not only for their effectiveness but also for their relatively small size,
which makes them executable even on less powerful hardware and thus suitable for low-resource environments.
We fine-tuned our models in a Google Colab environment, which provided us with a Tesla T4 GPU.</p>
        <p>Footnotes: (2) MultiCardioNER challenge website: https://temu.bsc.es/multicardioner/</p>
        <p>(3) MultiCardioNER Evaluation Library: https://github.com/nlp4bia-bsc/multicardioner_evaluation_library</p>
        <p>(4) https://huggingface.co/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es</p>
        <p>(5) https://huggingface.co/lcampillos/roberta-es-clinical-trials-ner</p>
        <p>(6) https://huggingface.co/PlanTL-GOB-ES/bsc-bio-ehr-es</p>
        <p>(7) https://huggingface.co/StivenLancheros/roberta-base-biomedical-clinical-es-finetuned-ner-CRAFT</p>
        <p>In the phase prior to our submission, we studied the effects of various hyperparameters and the
generalization error of our models by dividing the original corpus of clinical cases into three parts: (1) a
training set (60% of the original corpus) used to train the model, (2) a validation set (20% of the original
corpus) to evaluate the effects of the hyperparameters, and (3) a test set (20% of the original corpus) to
assess the models’ ability to generalize to unseen data (Internal Test Set Results).</p>
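        <p>The 60/20/20 partition can be reproduced with a few lines of code. The sketch below assumes a document-level split with a fixed random seed; the seed, the function name and the document identifiers are hypothetical, since the paper does not specify how the split was drawn.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random
from typing import List, Tuple


def split_corpus(doc_ids: List[str], seed: int = 42) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle document ids and split them 60/20/20 into train, validation, test."""
    ids = sorted(doc_ids)               # sort first so the result depends only on the seed
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * 60 // 100      # integer arithmetic avoids float rounding
    n_val = len(ids) * 20 // 100
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]        # the remainder goes to the internal test set
    return train, val, test


# Hypothetical document identifiers.
docs = [f"case_{i:03d}" for i in range(10)]
train, val, test = split_corpus(docs)
print(len(train), len(val), len(test))
```
]]></preformat>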
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Results</title>
        <p>To conduct the experiments, we first analyzed how variations in hyperparameters influenced performance on our
validation set. We began by adjusting the batch size, finding 4 to be optimal. We then
tuned the learning rate, finding that 8e-5 yielded the best results. Finally, we varied the weight
decay, concluding that a value of 0.2 was optimal. After fixing these hyperparameters, we increased
the number of training epochs and implemented an early stopping criterion that halts training if performance on the
validation set does not improve for five consecutive epochs. Further details on hyperparameter tuning
are provided in Appendix A.</p>
        <p>6.2.1. Cross-domain BioNER Evaluation</p>
        <p>To construct the optimal cross-domain transfer learning model, we conducted an internal test to select the
best combination of pre-trained models used at the various stages of the cross-domain process, as shown in
Table 2.</p>
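        <p>The early stopping criterion used during hyperparameter tuning (Section 6.2) can be sketched in plain Python. The configuration values below are those reported in the text; the dictionary keys, the function and the validation scores are hypothetical, and this is not the HuggingFace training loop itself.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Tuple

# Values reported in Section 6.2; the key names are illustrative only.
CONFIG = {"batch_size": 4, "learning_rate": 8e-5, "weight_decay": 0.2, "patience": 5}


def early_stopping(val_scores: List[float], patience: int = 5) -> Tuple[int, int]:
    """Return (best_epoch, last_epoch_run) for a sequence of validation scores.

    Training halts once the score has failed to improve for `patience`
    consecutive epochs. Epochs are numbered from 0.
    """
    best_score = float("-inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, score in enumerate(val_scores):
        if score > best_score:
            best_score, best_epoch = score, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_scores) - 1


# Hypothetical validation F1 per epoch; the late recovery is never reached.
scores = [0.70, 0.74, 0.76, 0.75, 0.76, 0.755, 0.75, 0.74, 0.73, 0.78]
best_epoch, stopped_at = early_stopping(scores, CONFIG["patience"])
print(best_epoch, stopped_at)
```
]]></preformat>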
        <p>The best results are shown in bold; the corresponding models were selected for the MultiCardioNER test results released by
the challenge organizers, shown in Table 3.</p>
        <p>The model bsc-bio-ehr-es, trained on CardioCCC, achieved the best results in the external test
with an F1 score of 0.7924, owing to the specificity of the CardioCCC dataset, which includes terminology
and concepts related to cardiology. In comparison, the models trained on DisTEMIST alone, on the
combination of DisTEMIST and CardioCCC, and on DisTEMIST and CardioCCC Augmented
showed lower performance. We attribute this to the more general nature of the DisTEMIST
dataset, which fails to effectively capture the specialized cardiology terms present in the test sets.
We therefore also consider the proposed model fusion approach.</p>
        <p>6.2.2. BioNER Fusion Evaluation</p>
        <p>We evaluate the impact of Fusion applied to the best combinations identified previously. Table 4 shows
that the most promising results stem from the top submission presented during the competition.</p>
        <p>Specifically, the BioNER Fusion performed on the combination of CardioCCC with the
bsc-bio-ehr-es model, DisTEMIST + CardioCCC with r-es-clinical-trials-ner, and DisTEMIST +
CardioCCC Augmented with r-es-clinical-trials-ner exhibited superior predictive performance.</p>
        <p>This integration, achieved through our fusion strategy, yielded an improvement in Recall (MiR),
showing the system’s greater ability to identify relevant entities. This outcome
suggests that fusion enabled the system to offset individual model deficiencies, thereby contributing
to an overall improvement in entity extraction effectiveness. By contrast, the fusion coupled with the
application of String Matching Cutter exhibits high Precision but low Recall, indicating a conservative
tendency of the system to recognize only highly probable entities. Conversely, the integration of String
Matching Adder with the fusion is characterized by greater inclusivity, at the expense of lower
Precision. Finally, examining the overall results of the challenge reveals that the leading models
(e.g. mdeberta, XLM-RoBERTa, CLIN-X-ES, ...) are at least 3-4 times larger than those employed
in our approach. Despite this, the best result achieved by our system nearly matched the performance
of the top models, with an F1 score of 0.791 compared to approximately 0.82. This demonstrates
that our approach is not only effective but also more efficient in terms of computational resources,
making it well suited to practical implementations with hardware constraints. Additionally, the fusion of
models trained on various datasets (CardioCCC, DisTEMIST, and augmented datasets) has demonstrated
the system’s capability to integrate and balance information from diverse sources, thereby enhancing
overall performance and flexibility.</p>
      </sec>
      <sec id="sec-6-3">
        <title>6.3. Error Analysis</title>
        <p>
          Inspired by Moscato et al. [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], we conducted a detailed error analysis of the differences among the
models, examining the number of correctly retrieved entity mentions (Correct) and the errors made. The errors can be
categorized into three distinct types:
          <list list-type="bullet">
            <list-item><p>Complete False Positive (CFP): The model identifies an entity that is not annotated as a named entity.</p></list-item>
            <list-item><p>Complete False Negative (CFN): The model fails to identify an entity that is annotated as a named entity.</p></list-item>
            <list-item><p>Right Label Overlapping Span (RLOS): The model correctly identifies the presence of an annotated named entity, but the span of the entity is incorrect.</p></list-item>
          </list>
        </p>
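        <p>The three error types can be counted with a short routine such as the one below. It assumes a single entity label, inclusive character offsets, and that any span overlap short of an exact match counts as RLOS; the function name and the example spans are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end inclusive


def overlaps(a: Span, b: Span) -> bool:
    return min(a[1], b[1]) - max(a[0], b[0]) >= 0


def categorize(gold: List[Span], pred: List[Span]) -> Dict[str, int]:
    """Count Correct, CFP, CFN and RLOS for one document."""
    counts = {"Correct": 0, "CFP": 0, "CFN": 0, "RLOS": 0}
    for p in pred:
        if p in gold:
            counts["Correct"] += 1   # exact span match
        elif any(overlaps(p, g) for g in gold):
            counts["RLOS"] += 1      # entity found, but with the wrong span
        else:
            counts["CFP"] += 1       # nothing annotated at this position
    for g in gold:
        if not any(overlaps(g, p) for p in pred):
            counts["CFN"] += 1       # annotated entity missed entirely
    return counts


# Hypothetical gold and predicted spans.
gold = [(0, 4), (10, 19), (30, 34)]
pred = [(0, 4), (12, 19), (50, 54)]
result = categorize(gold, pred)
print(result)
```
]]></preformat>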
        <p>This categorization allowed us to better understand the strengths and weaknesses of our system. The
results are shown in Table 5.</p>
        <p>The error analysis corroborates the previous evaluations, indicating that fusion combined with string
matching significantly reduces the number of false positives but drastically increases the number of
false negatives, as it extracts only half of the relevant entities. A possible cause of these results
lies in the quality and size of the Gazetteer used. The best balance between precision and recall
is once again achieved by the BioNER Fusion of the models trained on CardioCCC,
DisTEMIST + CardioCCC, and DisTEMIST + CardioCCC Augmented.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>In this study, we presented an innovative approach to address the challenge of BioNER Fusion in
the biomedical domain, with a particular focus on cardiology. Our methodology integrates data
augmentation techniques and data fusion mechanisms to enhance the robustness and coverage of the
generated annotations. By utilizing pre-trained models on biomedical corpora and refining them with
domain-specific cardiology data, we achieved significant results, overcoming the limitations related to
the scarce availability of domain-specific data.</p>
      <p>However, there are potential disadvantages to our approach. Data augmentation techniques, while
increasing the diversity of the training data, might also introduce noise and potentially irrelevant
information, which could hinder the model’s performance. Additionally, the complexity of integrating
multiple models through data fusion can increase computational requirements and may pose challenges
in real-time applications.</p>
      <p>The results obtained in the MultiCardioNER competition, part of the BioASQ 2024 challenge,
demonstrate the effectiveness of our approach. The key characteristics of our results include their efficacy,
computational efficiency, domain adaptation, flexibility, balance between precision and recall, robustness,
and innovativeness. These combined elements illustrate how our approach can be a valid and practical
solution for entity extraction from biomedical texts, especially in contexts with limited computational
resources. We exceeded the median F1 score of 0.7566 by approximately 4%, achieving a score of 0.791. This success highlights the
potential of the proposed techniques in addressing BioNER challenges in specific biomedical contexts,
paving the way for further improvements and applications in various clinical fields.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgements</title>
      <p>We acknowledge financial support from (1) the PNRR MUR project PE0000013-FAIR and (2) the Italian
Ministry of Economic Development, via the ICARUS (Intelligent Contract Automation for Rethinking
User Services) project (CUP: B69J23000270005).</p>
    </sec>
    <sec id="sec-9">
      <title>A. Hyperparameter Tuning</title>
      <p>We analyzed how variations in hyperparameters influence performance on our validation set. Specifically, we
varied each model’s batch size, learning rate, and weight decay one at a time and measured the effect on
precision, recall, and F1 score. We first tried a set of candidate batch sizes and selected the value
that gave the best performance.
We then experimented with various learning rates after fixing the batch size; in this case too, we
selected the value that yielded the highest scores. After setting the learning rate and batch size, we
examined a small variation in the weight decay and selected the best value following the same
logic.</p>
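      <p>The one-at-a-time procedure described above can be sketched as a greedy sequential search. The Python below is a minimal illustration under stated assumptions, not the training code used in this work. The function names sequential_search and toy_score are hypothetical, the intermediate learning rate 5e-5 and the starting learning rate are assumed, and toy_score is a placeholder for a real train-and-validate run: it returns no measured result, and its maximum is deliberately placed at the values reported in this appendix so that the example runs end to end.</p>

```python
def sequential_search(evaluate, grid, order, start):
    """Tune one hyperparameter at a time, freezing each winner before moving on."""
    best = dict(start)
    for name in order:
        scores = {}
        for value in grid[name]:
            trial = dict(best, **{name: value})
            scores[value] = evaluate(trial)
        # Keep the value with the highest validation score (first one on ties).
        best[name] = max(scores, key=scores.get)
    return best


def toy_score(cfg):
    """Placeholder for 'train, then return validation F1'. Not real data."""
    return -(
        abs(cfg["batch_size"] - 4)
        + abs(cfg["learning_rate"] - 8e-5) * 1e4
        + abs(cfg["weight_decay"] - 0.2)
    )


grid = {
    "batch_size": [16, 8, 4, 2],           # candidates named in this appendix
    "learning_rate": [2e-5, 5e-5, 8e-5],   # 5e-5 is an assumed midpoint
    "weight_decay": [0.1, 0.2],            # candidates named in this appendix
}
start = {"batch_size": 16, "learning_rate": 2e-5, "weight_decay": 0.1}
order = ["batch_size", "learning_rate", "weight_decay"]

best = sequential_search(toy_score, grid, order, start)
```

      <p>Such a search evaluates the sum of the candidate counts rather than their product, which keeps the number of training runs small, at the cost of ignoring interactions between hyperparameters.</p>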
      <p>Batch size. During training, we varied the batch size, initially set at 16, and then adjusted it to 8, 4,
and 2. The results in Table 6 indicate that the optimal batch size is 4.</p>
      <p>Learning rate. After determining the optimal batch size, we varied the initial learning rate used by
the AdamW optimizer, setting it between 2e-5 and 8e-5. The results in Table 7 show that the best
learning rate is 8e-5.</p>
      <p>Weight decay. Finally, we adjusted the weight decay applied to all layers except the bias and
LayerNorm weights in the AdamW optimizer, starting with a value of 0.1 and then increasing it to 0.2, which
proved to be the best solution, as reported in Table 8.</p>
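      <p>Excluding bias and LayerNorm weights from weight decay is commonly done by handing the optimizer two parameter groups. The Python below is a minimal sketch of that grouping, assuming Hugging Face-style parameter names (for example, names containing LayerNorm.weight). The helper build_param_groups is hypothetical, and the toy list of (name, parameter) pairs stands in for model.named_parameters(). PyTorch is not imported so that the snippet runs on its own; the closing comment shows how the groups would be passed to torch.optim.AdamW.</p>

```python
NO_DECAY_MARKERS = ("bias", "LayerNorm.weight")


def build_param_groups(named_parameters, weight_decay=0.2):
    """Split parameters into a decayed group and a non-decayed group."""
    decay, no_decay = [], []
    for name, param in named_parameters:
        if any(marker in name for marker in NO_DECAY_MARKERS):
            no_decay.append(param)
        else:
            decay.append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]


# Toy stand-in for model.named_parameters(); strings replace real tensors.
toy_named_parameters = [
    ("encoder.layer.0.attention.self.query.weight", "W_q"),
    ("encoder.layer.0.attention.self.query.bias", "b_q"),
    ("encoder.layer.0.output.LayerNorm.weight", "ln_w"),
    ("encoder.layer.0.output.LayerNorm.bias", "ln_b"),
    ("classifier.weight", "W_c"),
]
groups = build_param_groups(toy_named_parameters, weight_decay=0.2)

# With PyTorch installed, the groups would be used as:
#   optimizer = torch.optim.AdamW(build_param_groups(model.named_parameters()), lr=8e-5)
```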
      <p>As a result of these analyses, we determined the optimal hyperparameters as follows: a batch size of
4, a learning rate of 8e-5, and a weight decay of 0.2.</p>
      <p>We set the number of training epochs to 5 based on preliminary studies indicating that the model
tended to converge rapidly. Furthermore, we observed that after the fifth epoch, performance no longer
improved significantly. Therefore, to avoid overfitting and reduce training time, we chose to stop at
the fifth epoch.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] N. D. Nguyen, L. Du, W. L. Buntine, C. Chen, R. Beare,
          <article-title>Hardness-guided domain adaptation to recognise biomedical named entities under low-resource scenarios</article-title>,
          in: Y. Goldberg, Z. Kozareva, Y. Zhang (Eds.),
          <source>Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022</source>,
          Abu Dhabi, United Arab Emirates, December 7-11, 2022, Association for Computational Linguistics,
          <year>2022</year>, pp. <fpage>4063</fpage>-<lpage>4071</lpage>.
          URL: https://doi.org/10.18653/v1/2022.emnlp-main.271. doi:10.18653/V1/2022.EMNLP-MAIN.271.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] S. Chen, G. Aguilar, L. Neves, T. Solorio,
          <article-title>Data augmentation for cross-domain named entity recognition</article-title>,
          in: M. Moens, X. Huang, L. Specia, S. W. Yih (Eds.),
          <source>Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021</source>,
          Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021, Association for Computational Linguistics,
          <year>2021</year>, pp. <fpage>5346</fpage>-<lpage>5356</lpage>.
          URL: https://doi.org/10.18653/v1/2021.emnlp-main.434. doi:10.18653/V1/2021.EMNLP-MAIN.434.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] N. Sasikumar, K. S. I. Mantri,
          <article-title>Transfer learning for low-resource clinical named entity recognition</article-title>,
          in: T. Naumann, A. B. Abacha, S. Bethard, K. Roberts, A. Rumshisky (Eds.),
          <source>Proceedings of the 5th Clinical Natural Language Processing Workshop, ClinicalNLP@ACL 2023</source>,
          Toronto, Canada, July 14, 2023, Association for Computational Linguistics,
          <year>2023</year>, pp. <fpage>514</fpage>-<lpage>518</lpage>.
          URL: https://doi.org/10.18653/v1/2023.clinicalnlp-1.53. doi:10.18653/V1/2023.CLINICALNLP-1.53.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] M. Zhou, J. Tan, S. Yang, H. Wang, L. Wang, Z. Xiao,
          <article-title>Ensemble transfer learning on augmented domain resources for oncological named entity recognition in chinese clinical records</article-title>,
          <source>IEEE Access</source> <volume>11</volume> (<year>2023</year>) <fpage>80416</fpage>-<lpage>80428</lpage>.
          URL: https://doi.org/10.1109/ACCESS.2023.3299824. doi:10.1109/ACCESS.2023.3299824.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] U. Phan, N. Nguyen,
          <article-title>Simple semantic-based data augmentation for named entity recognition in biomedical texts</article-title>,
          in: D. Demner-Fushman, K. B. Cohen, S. Ananiadou, J. Tsujii (Eds.),
          <source>Proceedings of the 21st Workshop on Biomedical Language Processing, BioNLP@ACL 2022</source>,
          Dublin, Ireland, May 26, 2022, Association for Computational Linguistics,
          <year>2022</year>, pp. <fpage>123</fpage>-<lpage>129</lpage>.
          URL: https://doi.org/10.18653/v1/2022.bionlp-1.12. doi:10.18653/V1/2022.BIONLP-1.12.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] S. Ghosh, U. Tyagi, S. Kumar, D. Manocha,
          <article-title>Bioaug: Conditional generation based data augmentation for low-resource biomedical NER</article-title>,
          in: H. Chen, W. E. Duh, H. Huang, M. P. Kato, J. Mothe, B. Poblete (Eds.),
          <source>Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2023</source>,
          Taipei, Taiwan, July 23-27, 2023, ACM,
          <year>2023</year>, pp. <fpage>1853</fpage>-<lpage>1858</lpage>.
          URL: https://doi.org/10.1145/3539618.3591957. doi:10.1145/3539618.3591957.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] Q. Sun, P. Bhatia,
          <article-title>Neural entity recognition with gazetteer based fusion</article-title>,
          in: C. Zong, F. Xia, W. Li, R. Navigli (Eds.),
          <source>Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021</source>,
          Online Event, August 1-6, 2021, volume ACL/IJCNLP 2021 of Findings of ACL, Association for Computational Linguistics,
          <year>2021</year>, pp. <fpage>3291</fpage>-<lpage>3295</lpage>.
          URL: https://doi.org/10.18653/v1/2021.findings-acl.291. doi:10.18653/V1/2021.FINDINGS-ACL.291.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] S. Lima-López, E. Farré-Maduell, J. Rodríguez-Miret, M. Rodríguez-Ortega, L. Lilli, J. Lenkowicz, G. Ceroni, J. Kossoff, A. Shah, A. Nentidis, A. Krithara, G. Katsimpras, G. Paliouras, M. Krallinger,
          <article-title>Overview of MultiCardioNER task at BioASQ 2024 on Medical Speciality and Language Adaptation of Clinical NER Systems for Spanish, English and Italian</article-title>,
          in: G. Faggioli, N. Ferro, P. Galuščáková, A. García Seco de Herrera (Eds.),
          <source>Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum</source>,
          <year>2024</year>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] A. Nentidis, G. Katsimpras, A. Krithara, S. Lima-López, E. Farré-Maduell, M. Krallinger, N. Loukachevitch, V. Davydova, E. Tutubalina, G. Paliouras,
          <article-title>Overview of BioASQ 2024: The twelfth BioASQ challenge on Large-Scale Biomedical Semantic Indexing and Question Answering</article-title>,
          in: L. Goeuriot, P. Mulhem, G. Quénot, D. Schwab, L. Soulier, G. Maria Di Nunzio, P. Galuščáková, A. García Seco de Herrera, G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024)</source>,
          <year>2024</year>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10] A. Miranda-Escalada, L. Gascó, S. Lima-López, E. Farré-Maduell, D. Estrada, A. Nentidis, A. Krithara, G. Katsimpras, G. Paliouras, M. Krallinger,
          <article-title>Overview of distemist at bioasq: Automatic detection and normalization of diseases from clinical texts: results, methods, evaluation and multilingual resources</article-title>,
          in: G. Faggioli, N. Ferro, A. Hanbury, M. Potthast (Eds.),
          <source>Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum</source>,
          Bologna, Italy, September 5th - to - 8th, 2022, volume <volume>3180</volume> of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2022</year>, pp. <fpage>179</fpage>-<lpage>203</lpage>.
          URL: https://ceur-ws.org/Vol-3180/paper-11.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11] I. Bartolini, V. Moscato, M. Postiglione, G. Sperlì, A. Vignali,
          <article-title>COSINER: context similarity data augmentation for named entity recognition</article-title>,
          in: T. Skopal, F. Falchi, J. Lokoc, M. L. Sapino, I. Bartolini, M. Patella (Eds.),
          <source>Similarity Search and Applications - 15th International Conference, SISAP 2022</source>,
          Bologna, Italy, October 5-7, 2022, Proceedings, volume <volume>13590</volume> of Lecture Notes in Computer Science, Springer,
          <year>2022</year>, pp. <fpage>11</fpage>-<lpage>24</lpage>.
          URL: https://doi.org/10.1007/978-3-031-17849-8_2. doi:10.1007/978-3-031-17849-8_2.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12] C. P. Carrino, J. Armengol-Estapé, A. Gutiérrez-Fandiño, J. Llop-Palao, M. Pàmies, A. Gonzalez-Agirre, M. Villegas,
          <article-title>Biomedical and clinical language models for spanish: On the benefits of domain-specific pretraining in a mid-resource scenario</article-title>,
          <source>CoRR</source> abs/2109.03570 (<year>2021</year>).
          URL: https://arxiv.org/abs/2109.03570. arXiv:2109.03570.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13] C. P. Carrino, J. Llop, M. Pàmies, A. Gutiérrez-Fandiño, J. Armengol-Estapé, J. Silveira-Ocampo, A. Valencia, A. Gonzalez-Agirre, M. Villegas,
          <article-title>Pretrained biomedical language models for clinical NLP in spanish</article-title>,
          in: D. Demner-Fushman, K. B. Cohen, S. Ananiadou, J. Tsujii (Eds.),
          <source>Proceedings of the 21st Workshop on Biomedical Language Processing, BioNLP@ACL 2022</source>,
          Dublin, Ireland, May 26, 2022, Association for Computational Linguistics,
          <year>2022</year>, pp. <fpage>193</fpage>-<lpage>199</lpage>.
          URL: https://doi.org/10.18653/v1/2022.bionlp-1.19. doi:10.18653/V1/2022.BIONLP-1.19.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>K. B. Cohen</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Lanfranchi</surname>
            ,
            <given-names>M. J.</given-names>
          </string-name>
          <string-name>
            <surname>Choi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Bada</surname>
            ,
            <given-names>W. A. B.</given-names>
          </string-name>
          <string-name>
            <surname>Jr.</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Panteleyeva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Verspoor</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Palmer</surname>
            ,
            <given-names>L. E.</given-names>
          </string-name>
          <string-name>
            <surname>Hunter</surname>
          </string-name>
          ,
          <article-title>Coreference annotation and resolution in the colorado richly annotated full text (CRAFT) corpus of biomedical journal articles</article-title>
          ,
          <source>BMC Bioinform.</source>
          <volume>18</volume>
          (
          <year>2017</year>
          )
          <fpage>372:1</fpage>
          -
          <lpage>372:14</lpage>
          . URL: https://doi.org/10.1186/s12859-017-1775-9. doi:10.1186/S12859-017-1775-9.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>L.</given-names>
            <surname>Campillos-Llanos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Valverde-Mateos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Capllonch-Carrión</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moreno-Sandoval</surname>
          </string-name>
          ,
          <article-title>A clinical trials corpus annotated with UMLS entities to enhance the access to evidence-based medicine</article-title>
          ,
          <source>BMC Medical Informatics Decis. Mak.</source>
          <volume>21</volume>
          (
          <year>2021</year>
          )
          <elocation-id>69</elocation-id>
          . URL: https://doi.org/10.1186/s12911-021-01395-z. doi:10.1186/S12911-021-01395-Z.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Ramshaw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Marcus</surname>
          </string-name>
          ,
          <article-title>Text chunking using transformation-based learning</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Yarowsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Church</surname>
          </string-name>
          (Eds.),
          <source>Third Workshop on Very Large Corpora, VLC@ACL 1995, Cambridge, Massachusetts, USA, June 30, 1995</source>
          ,
          <year>1995</year>
          , pp.
          <fpage>82</fpage>
          -
          <lpage>94</lpage>
          . URL: https://aclanthology.org/W95-0107/.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>C. P.</given-names>
            <surname>Carrino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Silveira-Ocampo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gonzalez-Agirre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gutiérrez-Fandiño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Villegas</surname>
          </string-name>
          , Spanish biomedical crawled corpus,
          <year>2022</year>
          . URL: https://doi.org/10.5281/zenodo.5513237. doi:10.5281/zenodo.5513237.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A.</given-names>
            <surname>Intxaurrondo</surname>
          </string-name>
          , Scielo-spain-crawler,
          <year>2019</year>
          . URL: https://doi.org/10.5281/zenodo.2541681. doi:10.5281/zenodo.2541681.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Intxaurrondo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pérez-Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. P.</given-names>
            <surname>Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>López-Martín</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Santamaría</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>de la Peña</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Villegas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Akhondi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Valencia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lourenço</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          ,
          <article-title>The biomedical abbreviation recognition and resolution (BARR) track: Benchmarking, evaluation and importance of abbreviation recognition systems applied to Spanish biomedical abstracts</article-title>
          , in:
          <string-name>
            <given-names>R.</given-names>
            <surname>Martínez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Montalvo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>de Albornoz</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the Second Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2017) co-located with 33th Conference of the Spanish Society for Natural Language Processing (SEPLN 2017), Murcia, Spain, September 19, 2017</source>
          , volume
          <volume>1881</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2017</year>
          , pp.
          <fpage>230</fpage>
          -
          <lpage>246</lpage>
          . URL: https://ceur-ws.org/Vol-1881/Overview1.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>M.</given-names>
            <surname>Villegas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Intxaurrondo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gonzalez-Agirre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          , Mespen_parallel-corpora,
          <year>2019</year>
          . URL: https://doi.org/10.5281/zenodo.3562536. doi:10.5281/zenodo.3562536.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>L.</given-names>
            <surname>Campillos-Llanos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Valverde-Mateos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Capllonch-Carrión</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moreno-Sandoval</surname>
          </string-name>
          ,
          <article-title>CT-EBM-SP - Corpus of Clinical Trials for Evidence-Based-Medicine in Spanish</article-title>
          ,
          <year>2022</year>
          . URL: https://doi.org/10.1186/s12911-021-01395-z. doi:10.1186/s12911-021-01395-z.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>V.</given-names>
            <surname>Moscato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Postiglione</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sansone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sperlí</surname>
          </string-name>
          ,
          <article-title>Taughtnet: Learning multi-task biomedical named entity recognition from single-task teachers</article-title>
          ,
          <source>IEEE J. Biomed. Health Informatics</source>
          <volume>27</volume>
          (
          <year>2023</year>
          )
          <fpage>2512</fpage>
          -
          <lpage>2523</lpage>
          . URL: https://doi.org/10.1109/JBHI.2023.3244044. doi:10.1109/JBHI.2023.3244044.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>