<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Clinical Entity Recognition and Linking in Greek Discharge Letters using Multilingual-LLM-Based Multi-Stage System</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bor-Woei Huang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Padova</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We constructed a multi-stage, hierarchical system for the ELCardioCC task of clinical entity recognition and subsequent entity linking within the domain of cardiology. We integrated the capabilities of multilingual generative large language models (LLMs) and BERT encoder architectures across different phases of our Named Entity Recognition (NER) and Entity Linking (EL) pipeline. In the initial NER phase, we designed zero-shot prompts to instruct the LLMs in the extraction of relevant clinical mentions directly from the Greek discharge letters. These prompts also guided the models in generating accurate English translations of the identified Greek terms and in producing concise biomedical entity descriptions associated with these mentions. Further, to refine the initial set of extracted entities and enhance the overall precision of our NER results, we employed a BERT bi-encoder as a filtering mechanism, designed to identify and remove likely false positives. Then, for the EL phase, we utilized a BERT cross-encoder as the core linking component. This model took both the previously extracted clinical mentions and their generated biomedical entity descriptions as input to establish accurate mappings to standardized concepts within the ICD-10 knowledge base. Finally, the linked ICD-10 codes obtained from the EL phase were collected for the MLC-X task. Our best system achieved an F1 score of 0.5761 on the NER task, 0.5336 on the EL task, and 0.7543 on the MLC-X task.</p>
      </abstract>
      <kwd-group>
        <kwd>Clinical Named Entity Recognition</kwd>
        <kwd>Clinical Entity Linking</kwd>
        <kwd>Multilingual Language Model</kwd>
        <kwd>BERT Encoder</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Medical records can be made more amenable to statistical analysis by transforming unstructured clinical text
into standardized, searchable codes. This can support medical research, enable extensive retrospective
studies, and uncover cause-and-effect links between illnesses and symptoms, all of which contribute to
a better understanding of illnesses and their treatments. Moreover, structured data from correct coding
can help doctors find important information about a patient’s history, symptoms, and conditions more
quickly. This can help them make differential diagnoses and give more personalized care.</p>
      <p>
        Early Natural Language Processing (NLP) research concentrated heavily on English text in general
contexts. However, the processing of non-English clinical data has developed notably in recent years,
driven by the growing availability of varied datasets and advanced
multilingual language models. The ELCardioCC competition [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], within the BioASQ 2025 challenge
[
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ], specifically addresses this by focusing on automatically extracting clinical entities from Greek
discharge summaries. This includes identifying the chief complaint, diagnosis, prior medical history,
medications, and cardiac echo mentions. A subsequent task involves linking these extracted mentions to
their corresponding ICD-10 codes, which provide a universal standard for classifying health conditions.
The automation of the linking process offers improved collaboration and data sharing by ensuring
consistency in the exchange of medical information across various healthcare systems, regions, and
countries.
      </p>
      <p>
        While numerous pre-trained language models (PLMs) exist for various languages beyond English,
research in NER has frequently focused on fine-tuning these models with limited biomedical and clinical
data to create domain-specific solutions, particularly in low-resource scenarios [
        <xref ref-type="bibr" rid="ref4 ref5 ref6 ref7 ref8">4, 5, 6, 7, 8</xref>
        ]. In contrast,
EL in the medical domain for languages other than English has received considerably less attention,
possibly due to the lack of corresponding language-specific knowledge bases and readily available pre-trained
language models. Most existing work stems from specific challenges like CLEF eHealth,
IberLEF, BioASQ/DisTEMIST, and DEFT, and primarily focuses on Spanish and French. To the best of our
knowledge, the application of EL to Greek clinical text remains an unexplored area.
      </p>
      <p>Multilingual language models (MLMs) are useful tools, trained on enormous datasets encompassing
a wide array of languages. This inherent capability to understand Greek text is particularly beneficial
for the NER task on Greek clinical discharge letters. However, while MLMs excel as generalists in
language comprehension and generation across diverse languages, they are not inherently medical
domain experts. This lack of specialized clinical knowledge presents significant challenges, as medical
terminology is replete with nuances, including an abundance of synonyms, context-dependent meanings,
and specialized abbreviations. A general MLM, without explicit clinical training, often struggles to
accurately disambiguate these terms. To address these complexities, we adopted an integrated approach.
We leveraged several MLMs to tackle the challenges of cross-lingual NER, aiming to identify clinical
entities in Greek. Following this initial clinical entity recognition phase, we then employed a two-stage
hybrid retrieval approach. This approach was designed to establish a connection, linking the
identified clinical entities from the Greek text to their corresponding standardized concepts within the
comprehensive ICD-10 knowledge base.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        There are primarily two approaches to handling the NER and EL tasks: information retrieval systems and
multitask framework systems. Information retrieval systems retrieve similar instances
from knowledge bases, based on lexical overlap or semantic similarity [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ]. Hierarchical framework
NER and EL systems employ distinct models for each task, with the first model dedicated to
identifying and classifying entities, and a subsequent, separate model responsible for linking these
identified mentions to a knowledge base [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Hierarchical NER and EL systems with separate models for
NER and EL are conceptually straightforward; however, they are susceptible to error propagation. This
occurs because errors made in the initial NER stage directly impact the performance of the downstream
EL stage. If an entity is incorrectly identified, missed entirely, or has incorrect boundaries assigned
by the NER model, the follow-up EL model will struggle to link it to the correct knowledge base entry,
leading to a cascade of errors throughout the pipeline.
      </p>
      <p>
        Multitask framework systems come in two main forms, joint models and end-to-end models, which aim to reduce
the error propagation issues of the hierarchical framework. In joint models, the NER and
EL tasks are performed in parallel by a single transformer model [
        <xref ref-type="bibr" rid="ref12 ref13 ref14 ref15">12, 13, 14, 15</xref>
        ]. End-to-end models
take raw text as input and directly output linked entities, without a clearly defined intermediate NER
stage. This forces the model to learn both identification and linking in a unified manner [16].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Dataset</title>
        <p>In the ELCardioCC challenge, the training dataset consists of 1,000 Greek-language discharge summaries,
while the test dataset includes an additional 500 discharge letters. These clinical documents contain
complex information detailing patients’ diseases, symptoms, diagnoses, therapeutic interventions, and
clinical outcomes. The training corpus has been manually annotated to identify the exact spans of key
biomedical mentions. Each annotated entity is linked to its corresponding code in the 10th revision of the
International Classification of Diseases (ICD-10), a globally recognized medical taxonomy maintained
by the World Health Organization (WHO)<sup>1</sup>.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. System</title>
        <p>The pipeline for the cardio discharge letter NER and EL is illustrated in Figure 1. For the NER task,
we prompted LLMs Gemma-3, Phi-4 and Gemini to retrieve clinical mentions and generate English
translations and their descriptions. Potential false positives were then discarded by a NER filter. Moving
to the EL task, the English-translated mentions from the NER phase were combined with their generated
clinical entity descriptions and fed into the entity linker to find their corresponding ICD-10 codes.
Finally, for the MLC-X task, we simply collected the ICD-10 codes obtained during the EL phase.</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Named entity recognition phase</title>
          <p>Multilingual large language models: Gemma-3, Phi-4 and Gemini are trained on multilingual
corpora and can therefore be used to understand Greek discharge letters and extract clinical mentions. During the
NER phase, clinical terms in Greek, such as disease names, symptoms, and treatment references, were
directly translated to their standardized English equivalents. Our approach relies on prompt engineering,
where the models’ outputs were guided by constructed instructions tailored to the clinical
domain. These prompts were designed to maximize recall without the need for supervised training.</p>
          <p>Zero-shot prompt: We employed two separate prompts (detailed in Table 1) to extract medical
mentions from the Greek discharge letters. Prompt 1 aimed for broad coverage, seeking to identify all
medical terms and phrases present in the text. Prompt 2, however, specifically targeted mentions related
to diseases, syndromes, and their treatments. The output from Prompt 1 achieves high recall but
includes many general medical terms not directly relevant to cardiovascular diseases or symptoms,
necessitating a subsequent filtering step. Conversely, Prompt 2 was designed to yield mentions that
were more directly relevant to the entities we are seeking to extract.</p>
          <p>Prompt 1: Given Greek text {discharge letter}. Extract the medical terms from this text and
write English translation and concise descriptions of them in JSON format (key is
the Greek medical term, value is English translation and description pair tuple).</p>
          <p>Prompt 2: Given Greek text {discharge letter}. Extract the disease, syndrome, and treatment
terms from this text and write English translation and concise descriptions of them
in JSON format (key is the Greek medical term, value is English translation and
description pair tuple).</p>
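          <p>As an illustration of how such a prompt could be filled in and how the requested JSON could be consumed, the following is a minimal sketch and not the code used for the submissions. The placeholder name discharge_letter, the helper names build_prompt and parse_mentions, and the assumption that the model returns well-formed JSON whose values are two-element lists are all hypothetical.</p>
          <preformat>
```python
import json

# Template text follows Prompt 1 above; the placeholder is renamed to
# discharge_letter (an assumption) so that str.format accepts it.
PROMPT_1 = (
    "Given Greek text {discharge_letter}. Extract the medical terms from this text and "
    "write English translation and concise descriptions of them in JSON format (key is "
    "the Greek medical term, value is English translation and description pair tuple)."
)


def build_prompt(template, letter):
    """Insert one discharge letter (or one section of it) into a prompt template."""
    return template.format(discharge_letter=letter)


def parse_mentions(raw_json):
    """Turn the model's JSON answer into a list of mention records.

    Assumes the answer is valid JSON of the form
    {greek_term: [english_translation, description]}; malformed entries are skipped.
    """
    records = []
    for greek, value in json.loads(raw_json).items():
        if isinstance(value, (list, tuple)) and len(value) == 2:
            records.append(
                {"greek": greek, "english": value[0], "description": value[1]}
            )
    return records


if __name__ == "__main__":
    # Hypothetical model answer, for illustration only.
    answer = '{"αναιμία": ["Anemia", "A reduced number of red blood cells."]}'
    print(build_prompt(PROMPT_1, "...")[:40])
    print(parse_mentions(answer))
```
          </preformat>
          <p>In practice, model answers may need cleaning (for example, stripping surrounding text) before they parse as JSON; that step is omitted here.</p>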
          <p>The quantity of text provided to a language model influences the identification of medical terms
within Greek discharge letters. To ensure comprehensive retrieval of all relevant mentions, we
employed a dual strategy: First, we processed the entire discharge letter as a single unit, which
enables the language model to grasp the broader connections and interdependencies among various
medical mentions throughout the document. Second, we processed the letters section by section
(e.g., ΙΣΤΟΡΙΚΟ – ΑΝΤΙΚΕΙΜΕΝΙΚΗ ΕΞΕΤΑΣΗ (HISTORY – OBJECTIVE EXAMINATION), ΑΙΤΙΑ
ΕΙΣΟΔΟΥ-ΑΝΤΙΚΕΙΜΕΝΙΚΗ ΕΞΕΤΕΤΑΣΗ-ΙΣΤΟΡΙΚΟ (REASON FOR ADMISSION – OBJECTIVE
EXAMINATION - HISTORY), ΠΟΡΕΙΑ ΝΟΣΟΥ (COURSE OF DISEASE), ΕΡΓΑΣΤΗΡΙΑΚΕΣ ΕΞΕΤΑΣΕΙΣ
(LABORATORY TESTS), etc.), and afterward collected the extracted terms from each individual section. This
segmented processing allows the model to concentrate on the immediate context and pick out specific
mentions within each part.</p>
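          <p>The dual strategy can be sketched as follows. This is an illustrative sketch only: the way the letters were actually segmented is not described here, so the header list (a subset of the section titles quoted above), the exact-string header matching, and the function names are assumptions.</p>
          <preformat>
```python
# Illustrative subset of section headers; the real letters may use more variants.
SECTION_HEADERS = ["ΠΟΡΕΙΑ ΝΟΣΟΥ", "ΕΡΓΑΣΤΗΡΙΑΚΕΣ ΕΞΕΤΑΣΕΙΣ"]


def split_sections(letter, headers=SECTION_HEADERS):
    """Split a letter into (header, body) pairs at the first occurrence of each header."""
    hits = sorted((letter.find(h), h) for h in headers if h in letter)
    if not hits:
        return [("", letter.strip())]
    sections = []
    preamble = letter[: hits[0][0]].strip()
    if preamble:
        sections.append(("", preamble))
    for i, (pos, header) in enumerate(hits):
        end = hits[i + 1][0] if i + 1 != len(hits) else len(letter)
        body = letter[pos + len(header) : end].strip(" :\n")
        sections.append((header, body))
    return sections


def processing_units(letter):
    """Return the whole letter first, followed by each section body."""
    return [letter] + [body for _, body in split_sections(letter)]


if __name__ == "__main__":
    # Hypothetical two-section letter, for illustration only.
    sample = "ΠΟΡΕΙΑ ΝΟΣΟΥ: stable\nΕΡΓΑΣΤΗΡΙΑΚΕΣ ΕΞΕΤΑΣΕΙΣ: CRP 5"
    for unit in processing_units(sample):
        print(repr(unit))
```
          </preformat>
          <p>Each unit returned by processing_units would be sent to the model separately, and the extracted terms pooled afterward.</p>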
          <p>Beyond identifying clinical mentions in the Greek text, our prompts also directed the LLMs to produce
both English translations and brief descriptions for each extracted entity. These generated English
translations and descriptions were specifically created to support the subsequent entity linking process.
Table 2 provides an illustrative example of the LLM’s output for a segment of a Greek discharge letter,
displaying the extracted Greek mentions along with their corresponding English translations and
generated descriptions.</p>
          <p>Entity filter: This step focuses on refining the initial set of extracted medical terms from the
multilingual language models by removing likely false positives. To enhance the accuracy of these initial
extractions, we filter out less relevant mentions by assessing their semantic similarity to ICD-10 English
disease and symptom category definitions. This filtering employs a BERT bi-encoder to identify and
eliminate irrelevant extracted mentions. Specifically, a MedEmbed BERT bi-encoder<sup>2</sup> [17] generates
vector embeddings for each English-translated biomedical term previously identified in the Greek
discharge letter. This bi-encoder independently processes each English-translated biomedical term,
creating a dense, context-aware representation of its meaning. Similarly, the ICD-10 English category
definitions are also embedded using the same BERT bi-encoder. For each extracted English-translated
term, we calculate the cosine similarity between its embedding and the embeddings of the ICD-10
descriptions. A predefined similarity score threshold of 0.37 is then applied. This threshold
was established through testing on the training dataset, primarily to enhance NER F1 scores within
that dataset. Consequently, any extracted term whose maximum cosine similarity score falls below
this threshold is deemed less relevant and is discarded. Note that this similarity score
threshold was determined prior to our ensemble process, meaning it might not represent the globally
optimal threshold for the entire pipeline. Achieving such an optimal threshold would necessitate an
end-to-end tuning approach, where the influence of the subsequent ensemble process is explicitly
factored into the optimization.</p>
          <p><sup>2</sup> https://huggingface.co/abhinand/MedEmbed-large-v0.1</p>
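          <p>The thresholding rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: the embeddings are taken as given (the two-dimensional toy vectors below are hypothetical stand-ins for bi-encoder outputs), and the function names are not from the original system. Only the maximum-cosine-similarity rule and the 0.37 threshold come from the text.</p>
          <preformat>
```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_mentions(mention_vectors, icd_vectors, threshold=0.37):
    """Keep a term only if its best similarity to any ICD-10 definition reaches the threshold.

    mention_vectors: dict mapping an English-translated term to its embedding.
    icd_vectors: list of embeddings of ICD-10 category definitions.
    """
    kept = []
    for term, vector in mention_vectors.items():
        best = max(cosine(vector, d) for d in icd_vectors)
        if best >= threshold:
            kept.append(term)
    return kept


if __name__ == "__main__":
    # Toy vectors, for illustration only; real embeddings come from the bi-encoder.
    mentions = {"heart failure": [1.0, 0.0], "hospital bed": [0.0, 1.0]}
    icd_definitions = [[0.9, 0.1]]
    print(filter_mentions(mentions, icd_definitions))
```
          </preformat>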
          <p>Ensemble: By combining the medical terms identified from the full discharge letter and from its segmented
sections, using our two distinct prompting strategies, we increase the likelihood of achieving a more
comprehensive collection of relevant mentions, as some terms might be missed when the document
is processed solely as a whole or solely in isolated sections.</p>
          <p>The recognition of long-tail entities (highly specific or multi-word terms exceeding four words)
and nested entity mentions (where shorter, valid entities exist within longer ones) varies across
different language models. For instance, in the long-tail entity "Προκάρδιο άλγος από 3ημέρου
συσφικτικό με αντανάκλαση στον ΑΡ αγκώνα" (constrictive precordial pain for 3 days radiating to
the left elbow), some models, like Gemma-3, only extract "Προκάρδιο άλγος" (Precordial pain),
completely missing the extended description. Nested entities like "Συνδρομο Ταχυκαρδιας-Βραδυκαρδιας"
(Tachycardia-Bradycardia Syndrome), which contains the individual medical terms "Ταχυκαρδιας"
(Tachycardia) and "Βραδυκαρδιας" (Bradycardia), often result in different identifications across models.</p>
          <p>To mitigate the inconsistencies in identifying potential long-tail and nested entities across different
language models, we aggregated the terms identified by these models and applied selection criteria
based on either term length or majority voting. (1) Term length prioritization approach: for both
long-tail and nested entities, the longest term length method selected the most extensive span among
the extractions from different models. This ensured that more complete or encompassing phrases were
preferred. (2) Majority voting approach: an entity span was selected only if it was extracted by more than
half of the models. This method utilizes consensus to enhance the reliability of the identified entities.</p>
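          <p>The two selection criteria can be illustrated with the short sketch below. It is a simplified reading of the description above, not the submitted implementation: nesting is approximated by substring containment between extracted strings, and the function names are hypothetical.</p>
          <preformat>
```python
from collections import Counter


def longest_spans(model_outputs):
    """Term length prioritization: drop any span contained in a longer extracted span."""
    candidates = set().union(*model_outputs)
    return sorted(
        span
        for span in candidates
        if not any(span != other and span in other for other in candidates)
    )


def majority_vote(model_outputs):
    """Majority voting: keep a span only if more than half of the models extracted it."""
    votes = Counter(span for output in model_outputs for span in set(output))
    return sorted(span for span, n in votes.items() if n * 2 > len(model_outputs))


if __name__ == "__main__":
    # Hypothetical extractions from three models for the same letter.
    outputs = [
        ["Προκάρδιο άλγος"],
        ["Προκάρδιο άλγος από 3ημέρου"],
        ["Προκάρδιο άλγος"],
    ]
    print(longest_spans(outputs))
    print(majority_vote(outputs))
```
          </preformat>
          <p>On this toy input the two criteria disagree: length prioritization keeps the longer phrase, while majority voting keeps the shorter one that two of the three models agree on.</p>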
          <table-wrap>
            <table>
              <thead>
                <tr><th>Ensemble methods</th></tr>
              </thead>
              <tbody>
                <tr><td>Gemma-3’s Prompt 2 output with section-wise processing (incorporating long-tail and nested entities from other LLMs’ extractions)</td></tr>
                <tr><td>Union of Run 1nm’s output with Gemini’s Prompt 2 output with entire letter processing</td></tr>
                <tr><td>Gemma-3’s Prompt 1 output with section-wise processing (incorporating long-tail and nested entities from other LLMs’ extractions)</td></tr>
                <tr><td>Union of Run 3nm’s output with Gemini’s Prompt 2 output with entire letter processing</td></tr>
                <tr><td>Fused the outputs of Run 1nm, Run 3nm, and Gemini (using the majority voting method)</td></tr>
              </tbody>
            </table>
          </table-wrap>
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Entity linking phase</title>
          <p>We implemented a search-engine-style algorithm for the entity linking task, namely the widely used and efficient
two-phase hybrid retrieval system, to speed up the entity linking process. The first step uses the
high-recall bag-of-words retrieval function BM25 to efficiently narrow down the vast ICD-10 knowledge
base to a relatively small set of candidate codes for a given English-translated mention extracted in the NER
phase. The correct ICD-10 code is highly likely to be within this candidate set. The second step takes
the ICD-10 candidates generated by BM25 and links the mention to the ICD-10 code with the highest score from a BERT
cross-encoder, which can model the semantic context of the mention and the entity descriptions with
higher precision.</p>
          <p>First-stage candidate generation: To generate a set of potential ICD-10 codes relevant to the
extracted Greek medical mentions, we employed a BM25 searcher. Our method involved preparing the
English ICD-10 knowledge base for efficient searching. We treated each ICD-10 code’s English category
definition as a distinct document. To enrich the content of these "documents" and improve the
chance of a match, we concatenated the textual description associated with each ICD-10 code
with the descriptions of all its more specific subcategories within the ICD-10 hierarchy.
This effectively creates a more comprehensive textual representation for each ICD-10 concept. Following
this, for each English translation of a medical mention extracted from the Greek discharge letter, we
used it as a query to search across these constructed ICD-10 definition "documents" using the BM25
algorithm, which is defined as:</p>
          <p><disp-formula id="eq-1"><label>(1)</label><tex-math>\text{Relevance Score}(D, Q) = \sum_{q \in Q} \mathrm{IDF}(q) \cdot \frac{f(q, D) \cdot (k_1 + 1)}{f(q, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\mathrm{avgdl}}\right)}</tex-math></disp-formula></p>
          <p>where Q is an English medical term, IDF(q) represents the inverse document frequency weight of the
medical mention q, and f(q, D) denotes the frequency of a medical mention q within the description
document D. |D| is the length of the description document, and avgdl is the average document length in
the collection. We set k<sub>1</sub> = 0.5 and b = 0.3.</p>
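          <p>To make the BM25 scoring concrete, the following self-contained sketch scores a query against a tiny collection using the parameter values given above (k1 = 0.5, b = 0.3). Several details are assumptions because they are not specified here: the whitespace tokenizer, the particular smoothed IDF variant, and the two example documents (code descriptions concatenated with a few subcategory names, for illustration only).</p>
          <preformat>
```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer could be used)."""
    return text.lower().split()


class BM25:
    def __init__(self, documents, k1=0.5, b=0.3):
        self.k1 = k1
        self.b = b
        self.doc_tokens = [tokenize(d) for d in documents]
        self.n_docs = len(self.doc_tokens)
        self.avgdl = sum(len(t) for t in self.doc_tokens) / self.n_docs
        # Document frequency: in how many documents each token occurs.
        self.df = Counter(tok for toks in self.doc_tokens for tok in set(toks))

    def idf(self, term):
        """Smoothed IDF, ln(1 + (N - n + 0.5) / (n + 0.5)); the variant is an assumption."""
        n = self.df.get(term, 0)
        return math.log(1 + (self.n_docs - n + 0.5) / (n + 0.5))

    def score(self, query, index):
        """BM25 relevance of document number `index` for the query string."""
        tokens = self.doc_tokens[index]
        freqs = Counter(tokens)
        length_norm = 1 - self.b + self.b * len(tokens) / self.avgdl
        total = 0.0
        for q in tokenize(query):
            f = freqs.get(q, 0)
            total += self.idf(q) * f * (self.k1 + 1) / (f + self.k1 * length_norm)
        return total

    def top_k(self, query, k=10):
        """Indices of the k highest-scoring documents, best first."""
        order = sorted(
            range(self.n_docs), key=lambda i: self.score(query, i), reverse=True
        )
        return order[:k]


if __name__ == "__main__":
    # Illustrative documents: a category description joined with subcategory names.
    codes = ["I50", "E78"]
    docs = [
        "Heart failure Congestive heart failure Left ventricular failure",
        "Disorders of lipoprotein metabolism and other lipidemias Pure hypercholesterolemia",
    ]
    bm25 = BM25(docs)
    print([codes[i] for i in bm25.top_k("heart failure", k=1)])
```
          </preformat>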
          <p>Second-stage linking: In the second stage of EL, to refine the candidate ICD-10 codes retrieved
by BM25, we formulated new queries. These queries were constructed by combining the English
translation of the extracted Greek medical mention with the concise descriptions of these mentions
generated by the multilingual LLMs. Using the set of ICD-10 codes identified as potential candidates in
the BM25 retrieval stage, we then employed a MedCPT cross-encoder<sup>3</sup> [18] to calculate a relevance
score between our augmented query (English-translated mention + LLM-generated English description)
and the constructed ICD-10 English definition documents. We established the link by selecting the ICD-10
code that received the highest relevance score from the MedCPT cross-encoder.</p>
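          <p>The selection step can be sketched independently of the scoring model. In the sketch below, score_pair stands for any function that returns a relevance score for a (query, document) pair; in the system described above that role is played by the MedCPT cross-encoder, whereas the token-overlap scorer here is only a hypothetical stand-in so that the example runs without model weights. Joining the mention and description with a space is likewise an assumption.</p>
          <preformat>
```python
def build_query(english_mention, description):
    """Augmented query: English-translated mention followed by its generated description."""
    return f"{english_mention} {description}"


def link_mention(query, candidates, score_pair):
    """Return the candidate ICD-10 code whose definition document scores highest.

    candidates: dict mapping an ICD-10 code to its definition document.
    score_pair: callable (query, document) returning a relevance score.
    """
    return max(candidates, key=lambda code: score_pair(query, candidates[code]))


def overlap_score(query, document):
    """Toy stand-in scorer: fraction of query tokens that also occur in the document."""
    q = set(query.lower().split())
    d = set(document.lower().split())
    return len(q.intersection(d)) / max(len(q), 1)


if __name__ == "__main__":
    # Candidate set as it might come out of the BM25 stage (illustrative).
    candidates = {
        "I25": "Chronic ischemic heart disease",
        "I50": "Heart failure",
    }
    query = build_query(
        "Heart Failure",
        "A condition in which the heart is unable to pump enough blood",
    )
    print(link_mention(query, candidates, overlap_score))
```
          </preformat>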
          <p>To clarify the roles of our BERT models, the cross-encoder in this EL stage is designed for precise
identification of the definitive ICD-10 descriptions. This differs from the BERT bi-encoder, which served
as a filtering mechanism in the previous NER phase, merely removing potential false positives without
directly linking to ICD-10 codes. We must also note an oversight: we initially missed the ICD-10
candidate list provided in labelset.txt. This led us to implement a BM25 searcher for acceleration, though
it was likely unnecessary given that the list contains only 324 candidate codes.</p>
        </sec>
        <sec id="sec-3-2-3">
          <title>3.2.3. Multi-label classification phase</title>
          <p>Our submissions for the MLC-X task were direct aggregations of all the ICD-10 codes mapped
during the EL phase.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Discussion</title>
      <sec id="sec-4-1">
        <title>4.1. Results</title>
        <p>Our system demonstrated generally poor precision across the NER, EL, and MLC-X tasks, which
significantly hindered our overall F1 scores. The primary reason for this lies in our prompt design,
particularly Prompt 1, which was optimized for maximizing recall. This led to the extraction of an
excessive number of medical terms that fell outside the cardiology domain (i.e., not in the code list in
the labelset.txt file).</p>
        <p><sup>3</sup> https://huggingface.co/ncbi/MedCPT-Cross-Encoder</p>
        <p>Since we utilized a hierarchical framework system for the NER, EL and MLC-X tasks, the evaluations
of the EL results are directly influenced by the outputs of the NER phase. The correlations between
these two tasks are clearly visible in Tables 4 and 5. Among our five submissions, Run 1nm achieved the
highest precision for both the NER and EL tasks. Conversely, Run 4nm demonstrated the highest recall,
though its lowest precision resulted in the lowest F1 score. Run 5nm secured the highest F1 score overall,
despite not having the top recall or precision in either the NER or EL tasks. This outcome suggests that
our majority voting method, used to fuse the outputs from different LLMs, was particularly effective in
improving both precision and F1 scores by balancing these metrics.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Discussion</title>
        <p>Upon analyzing the performance of our system, we identified several key factors that significantly
influenced the outcomes of the 3 tasks.</p>
        <p>Translation accuracy: The accuracy of the initial translation of Greek entity names is paramount,
as it directly influences the success of subsequent steps. An incorrect English translation will result
in a mismatched entity description, inevitably leading to erroneous entity linking, regardless of the
linking model’s inherent capabilities. Language models tend to be more accurate when processing full
entity names compared to their abbreviated forms. Table 7 illustrates instances where Greek medical
abbreviations can have multiple English translations and consequently generate divergent descriptions,
ultimately causing the entity linking process to point to incorrect ICD-10 codes.</p>
        <table-wrap>
          <label>Table 7</label>
          <table>
            <thead>
              <tr><th>Abbreviation</th><th>English Translation</th><th>Generated Description</th></tr>
            </thead>
            <tbody>
              <tr><td>ΣΝ</td><td colspan="2">annotation: I25 - Chronic ischemic heart disease</td></tr>
              <tr><td></td><td>Coronary Artery Disease</td><td>A condition in which plaque builds up inside the coronary arteries.</td></tr>
              <tr><td></td><td>Heart Failure</td><td>A condition in which the heart is unable to pump enough blood to meet the body’s needs.</td></tr>
              <tr><td>ΔΛΔ</td><td colspan="2">annotation: E78 - Disorders of lipoprotein metabolism and other lipidemias</td></tr>
              <tr><td></td><td>Dyslipidemia</td><td>An abnormal amount of lipids (e.g., cholesterol, triglycerides) in the blood.</td></tr>
              <tr><td></td><td>Deep Vein Thrombosis</td><td>A blood clot that forms in a deep vein, usually in the leg.</td></tr>
              <tr><td></td><td>Peripheral Artery Disease</td><td>Condition where narrowed arteries reduce blood flow to the limbs.</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Prompt design and section-wise processing: The higher recall but higher false positive rate
observed using Prompt 1 with section-wise processing can be attributed to several factors. Language
models can exhibit positional biases, leading to missed entities in lengthy documents. Furthermore, in
lengthy text, the signal for specific entities can be weakened by the presence of substantial irrelevant
information, making identification more challenging. Processing the document section by section
allows the model to focus on smaller chunks of text. This approach helps to overcome positional biases
and reduces the amount of noise within a given processing window, thereby improving the chances of
retrieving most of the desired entities. However, this segmentation can also lead to the extraction of
unwanted noise mentions that might not have been identified if the model had the broader context of
the entire document.</p>
        <sec id="sec-4-2-1">
          <title>Nested entity mention and long-tail entity challenges</title>
          <p>Further complicating the NER task are
the inherent challenges of nested entity mentions and long-tail entities. Our NER system significantly
struggled with both long-tail entities and nested entity mentions; even medical mentions extracted
by the LLMs that successfully passed the NER filter were ultimately considered incorrect in the NER
evaluations, despite being the terms we aimed to extract.</p>
          <p>Different LLMs often analyze and handle nested entity mentions in varying ways, leading to
discrepancies in their output. As seen in Table 8, consider the nested entity "ΣΝ (PCI)". While "ΣΝ" represents
a type of disease, "PCI" denotes a specific heart treatment. The mention of "PCI" within "ΣΝ" indicates a
patient with the disease who has undergone PCI. Another example is "ΟΣΣ-PCI LAD," which includes
both "ΟΣΣ" (acute myocardial infarction) and "PCI LAD" (PCI specifically targeting the Left Anterior
Descending artery), describing a patient who received that targeted treatment. These deviations can also
affect the downstream EL results. Consider the long-tail entity "Υπόχρωμη μικροκυτταρική αναιμία"
(Hypochromic microcytic anemia), which is a specific type of "αναιμία" (anemia); each of these
terms could potentially link to distinct ICD-10 codes.</p>
          <p>Despite our selection process prioritizing either the largest encompassing span or the entity span that
received the most votes from different model outputs, these methods often failed to correctly capture
such diverse long-tail and nested entities.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>We developed a multi-stage, hierarchical system for the ELCardioCC task, focusing on medical entity
recognition and entity linking. While our system demonstrated moderate recall, its precision was
noticeably low. A clear trend of error propagation was observable across the multi-stage pipeline, directly
impacting results from the NER phase down to the EL and MLC-X tasks. A significant contributing factor
to this limitation is our system’s reliance on English-translated text for processing. The crucial bridge
between the Greek clinical corpus and its English translation is entirely dependent on the multilingual
LLMs, and the quality control of these translations remains beyond our current capabilities.</p>
      <p>To further improve precision and F1, we need to design more precise prompts that maintain moderate
recall, and implement more robust filtering mechanisms to effectively discard false positive mentions.
This is a key area for future development to refine the accuracy of our extracted and linked clinical
entities. A significant oversight on our part was neglecting the ICD-10 code list provided in the
labelset.txt file. This list would have dramatically reduced the search space for ICD-10 codes from over
10,000 to just 324 candidates. Using this candidate list could have substantially improved both the
precision and F1 scores in our ELCardioCC challenge submissions.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this manuscript, the author utilized Gemini as a language refinement tool.
All content was reviewed and edited by the author, who takes full responsibility for the accuracy and
integrity of the publication.</p>
      <p>of the Association for Computational Linguistics and the 11th International Joint Conference on
Natural Language Processing (Volume 1: Long Papers), 2021, pp. 6214–6224.</p>
      <p>[16] S. Ujiie, H. Iso, S. Yada, S. Wakamiya, E. Aramaki, End-to-end biomedical entity linking with
span-based dictionary matching, arXiv preprint arXiv:2104.10493 (2021).</p>
      <p>[17] A. Balachandran, MedEmbed: Medical-focused embedding models, 2024. URL: https://github.com/abhinand5/MedEmbed.</p>
      <p>[18] Q. Jin, W. Kim, Q. Chen, D. C. Comeau, L. Yeganova, W. J. Wilbur, Z. Lu, MedCPT: Contrastive
pre-trained transformers with large-scale PubMed search logs for zero-shot biomedical information
retrieval, Bioinformatics 39 (2023) btad651.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dimitriadis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Patsiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stoikopoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Toumpas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kipouros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Papadopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bekiaridou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Barmpagiannos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vasilopoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barmpagiannos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Samaras</surname>
          </string-name>
          , G. Giannakoulas, G. Tsoumakas,
          <source>Overview of ElCardioCC Task on Clinical Coding in Cardiology at BioASQ</source>
          <year>2025</year>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          , D. Spina (Eds.),
          <source>CLEF 2025 Working Notes</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nentidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Katsimpras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krithara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rodríguez-Ortega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Rodriguez-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Loukachevitch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sakhovskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Tutubalina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dimitriadis</surname>
          </string-name>
          , G. Tsoumakas,
          <string-name>
            <given-names>G.</given-names>
            <surname>Giannakoulas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bekiaridou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Samaras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Di Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Marchesin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Martinelli</surname>
          </string-name>
          , G. Silvello, G. Paliouras,
          <article-title>Overview of BioASQ 2025: The thirteenth BioASQ challenge on large-scale biomedical semantic indexing and question answering</article-title>
          , in: J. Carrillo-de-Albornoz, J. Gonzalo, L. Plaza, A. García Seco de Herrera, J. Mothe, F. Piroi, P. Rosso, D. Spina, G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Sixteenth International Conference of the CLEF Association (CLEF 2025)</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Tsatsaronis</surname>
          </string-name>
          , G. Balikas,
          <string-name>
            <given-names>P.</given-names>
            <surname>Malakasiotis</surname>
          </string-name>
          , I. Partalas,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zschunke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Alvers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Weissenborn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krithara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Petridis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Polychronopoulos</surname>
          </string-name>
          , et al.,
          <article-title>An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition</article-title>
          ,
          <source>BMC Bioinformatics</source>
          <volume>16</volume>
          (
          <year>2015</year>
          )
          <fpage>1</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Vulić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Korhonen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Collier</surname>
          </string-name>
          ,
          <article-title>Learning domain-specialised representations for crosslingual biomedical entity linking</article-title>
          ,
          <source>arXiv preprint arXiv:2105.14398</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Boudjellal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ahmad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Naseem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shang</surname>
          </string-name>
          , L. Dai,
          <article-title>ABioNER: A BERT-based model for Arabic biomedical named-entity recognition</article-title>
          ,
          <source>Complexity</source>
          <volume>2021</volume>
          (
          <year>2021</year>
          )
          <fpage>6633213</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E. T. Rubel</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Andrioli de Souza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Knafou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. E.</given-names>
            <surname>Oliveira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. B.</given-names>
            <surname>Gumiel</surname>
          </string-name>
          , L. F. de Oliveira,
          <string-name>
            <given-names>D.</given-names>
            <surname>Teodoro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. C.</given-names>
            <surname>Paraiso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Moro</surname>
          </string-name>
          , et al.,
          <article-title>BioBERTpt: a Portuguese neural language model for clinical named entity recognition</article-title>
          ,
          in:
          <source>Proceedings of the 3rd Clinical Natural Language Processing Workshop, 19 November 2020</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-H.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Jang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. J.</given-names>
            <surname>Yum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Shin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-M.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. J.</given-names>
            <surname>Joo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>A pre-trained BERT for Korean medical natural language processing</article-title>
          ,
          <source>Scientific Reports</source>
          <volume>12</volume>
          (
          <year>2022</year>
          )
          <fpage>13847</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mitrofan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Păis</surname>
          </string-name>
          ,
          <article-title>Improving Romanian BioNER using a biologically inspired system</article-title>
          ,
          <source>in: Proceedings of the 21st Workshop on Biomedical Language Processing</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>316</fpage>
          -
          <lpage>322</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ceccato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Fabbian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.-W.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. U.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          , et al.,
          <article-title>SEUPD@CLEF: Team HIBALL on incremental information retrieval system with RRF and BERT</article-title>
          ,
          in:
          <source>CEUR Workshop Proceedings</source>
          , volume
          <volume>3497</volume>
          , CEUR-WS
          ,
          <year>2023</year>
          , pp.
          <fpage>2396</fpage>
          -
          <lpage>2415</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>B.-W.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Generative large language models augmented hybrid retrieval system for biomedical question answering</article-title>
          ,
          <source>CLEF Working Notes</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Logeswaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Zero-shot entity linking by reading entity descriptions</article-title>
          ,
          <source>arXiv preprint arXiv:1906.07348</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Garda</surname>
          </string-name>
          , U. Leser,
          <article-title>BELHD: improving biomedical entity linking with homonym disambiguation</article-title>
          ,
          <source>Bioinformatics</source>
          <volume>40</volume>
          (
          <year>2024</year>
          )
          <fpage>btae474</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>López-García</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Jerez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ribelles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Alba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. J.</given-names>
            <surname>Veredas</surname>
          </string-name>
          ,
          <article-title>Explainable clinical coding with in-domain adapted transformers</article-title>
          ,
          <source>Journal of Biomedical Informatics</source>
          <volume>139</volume>
          (
          <year>2023</year>
          )
          <fpage>104323</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <article-title>A joint model for medical named entity recognition and normalization</article-title>
          ,
          <source>Proceedings http://ceur-ws.org ISSN 1613</source>
          (
          <year>2020</year>
          )
          <fpage>17</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <article-title>An end-to-end progressive multi-task learning framework for medical named entity recognition and normalization</article-title>
          ,
          <source>in: Proceedings of the 59th Annual Meeting</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>