<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>V. Davydova);</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Overview of BioNNE Task on Biomedical Nested Named Entity Recognition at BioASQ 2024</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vera Davydova</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Natalia Loukachevitch</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elena Tutubalina</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Artificial Intelligence Research Institute</institution>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>HSE University</institution>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Kazan Federal University</institution>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Lomonosov Moscow State University</institution>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Recognition of nested named entities, which may contain each other, can enhance the coverage of found named entities. This capability is particularly useful for tasks such as relation extraction, entity linking, and knowledge graph population. This paper presents the organizers' report on the BioNNE competition, which focused on nested named entity recognition systems in medical texts for both English and Russian. The competition includes three subtasks: Bilingual, English-oriented, and Russian-oriented. Training and validation sets were derived from a subset of the NEREL-BIO dataset, a corpus of PubMed abstracts. For the BioNNE evaluation, eight of the most common medical entity types were selected from the original dataset. Additionally, a novel test set was developed for the shared task, consisting of 154 abstracts in both English and Russian. Held within the framework of the BioASQ workshop, the competition aims to advance research in nested NER within the biomedical domain.</p>
      </abstract>
      <kwd-group>
        <kwd>BioNLP</kwd>
        <kwd>Nested Named Entity Recognition</kwd>
        <kwd>Biomedical Text Mining</kwd>
        <kwd>Domain-specific Language Models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Nested Named Entity Recognition (NNER) extends the standard Named Entity Recognition (NER)
task by addressing overlapping and nested entities within texts, that is, entities that occur inside
other entities. While more complex, it provides a richer and more nuanced understanding of the
entities in a document, making it invaluable for domains requiring precise and hierarchical entity
extraction. One such domain is biomedical scientific texts, where nested structures are common.
Identifying named entities in scientific texts is essential for extracting valuable information and
advancing biomedical research. Traditional NER systems often fail to
adequately capture this nested structure, leading to a loss of crucial information. Recent studies [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ]
leverage sequence-to-sequence models and reinforcement learning to handle these nested
structures more efficiently.
      </p>
      <p>
        Most studies focus on flat (non-nested) NER tasks, and only a few general-domain datasets
cover nested entities [
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6</xref>
        ]. The GENIA corpus [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], a popular NER dataset within the biomedical domain,
includes over 100K annotations across 47 entity types, yet only 17% of the entities in the GENIA corpus
are nested within another entity [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. A recent dataset, NEREL [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], is annotated with over 56K named
entities of 29 types, while its biomedical extension, NEREL-BIO [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], is annotated with over 70K entities
of 37 types.
      </p>
      <p>
        This paper provides a comprehensive overview of the Biomedical Nested Named Entity Recognition
(BioNNE) task, predominantly focusing on the challenges and advancements in nested NER across
annotated PubMed abstracts. The shared task includes both English and Russian biomedical texts and
was part of the BioASQ Workshop 2024 [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The BioASQ shared tasks aim to advance systems that
exploit diverse and extensive online information to meet the information needs of biomedical scientists.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. BioNNE Shared Task</title>
      <p>The main task is to extract and classify nested biomedical named entities mentioned in unstructured
medical abstract text. It consists of three tracks: (i) Bilingual, (ii) English-oriented, and (iii)
Russian-oriented.</p>
      <list list-type="bullet">
        <list-item>
          <p>Bilingual: participants in this track were required to train a single multilingual NER model using
training data for both Russian and English. The same model had to be used to
generate prediction files for each language’s dataset. Predictions from any monolingual model
were not allowed in this track.</p>
        </list-item>
        <list-item>
          <p>English-oriented: participants in this track were required to train a nested NER model for English
scientific abstracts in the biomedical domain.</p>
        </list-item>
        <list-item>
          <p>Russian-oriented: participants in this track were required to train a nested NER model for Russian
scientific abstracts in the biomedical domain.</p>
        </list-item>
      </list>
      <p>Predictions submitted to track (i) could not be reused in tracks (ii) and (iii). Participants were allowed
to train any model architecture on any publicly available data to achieve the best performance.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <p>
        Training and validation sets for the BioNNE competition were based on a subset of the NEREL-BIO dataset
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. NEREL-BIO is a corpus of PubMed abstracts written in Russian and English. It extends the
NEREL [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] dataset, originally designed for the general domain, by incorporating biomedical entity
types. Biomedical entity types in NEREL-BIO are annotated according to UMLS definitions of relevant
concepts. All the abstracts are annotated in the BRAT format [11].
      </p>
      <sec id="sec-3-1">
        <p><sup>1</sup> https://github.com/nerel-ds/NEREL-BIO/tree/master/bio-nne/</p>
        <p><sup>2</sup> https://codalab.lisn.upsaclay.fr/competitions/16464</p>
        <table-wrap>
          <label>Table 1</label>
          <caption>
            <p>Entity types with explanations and examples.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Entity type</th>
                <th>Explanation</th>
                <th>Examples</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>FINDING</td>
                <td>does not have direct correspondence in UMLS, it conveys the results of scientific study described in the abstract</td>
                <td>stopped the progression of keratoconus, stabilize the glaucoma process, longer hospital stay</td>
              </tr>
              <tr>
                <td>PHYS</td>
                <td>biological function or process in organism including organism attribute (temperature) and excluding mental processes</td>
                <td>blood flow, childbirth, uterine contraction, arterial pressure, body temperature</td>
              </tr>
              <tr>
                <td>INJURY_POISONING</td>
                <td>damage inflicted on the body as the direct or indirect result of external force including poisoning</td>
                <td>overdosing, burn, drowning, falling, childhood trauma</td>
              </tr>
              <tr>
                <td>DISO</td>
                <td>any deviations from normal state of organism: diseases, symptoms, abnormality of organ, excluding injuries or poisoning</td>
                <td>appendicitis, haemorrhoids, magnesium deficiency, dysfunctions, Diabetes Mellitus, spine pain, complication, bone cyst, acute inflammation, deep vein thrombosis</td>
              </tr>
              <tr>
                <td>LABPROC</td>
                <td>testing body substances and other diagnostic procedures such as ultrasonography</td>
                <td>biochemical analysis, polymerase chain reaction test, electrocardiogram, histological</td>
              </tr>
              <tr>
                <td>ANATOMY</td>
                <td>comprises organs, body part, cells and cell components, body substances</td>
                <td>eye, bone, brain, lower limb, oral cavity, blood, anterior lens capsule, right ventricle, lymphocyte</td>
              </tr>
              <tr>
                <td>CHEM</td>
                <td>chemicals including legal and illegal drugs, biological molecules</td>
                <td>opioid, lipoprotein, iodine, adrenalin, memantine, methylprednisolone</td>
              </tr>
              <tr>
                <td>DEVICE</td>
                <td>manufactured objects used for medical purposes</td>
                <td>catheter, prosthesis, tonometer, tomograph, removable prosthesis, stent, metal stent</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Figures 1 and 2 present parallel examples of nested named entities in NEREL-BIO for one abstract.
Table 1 provides a comprehensive list of entity types, along with their explanations and examples.</p>
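        <p>As an illustration of how nested entities appear in BRAT standoff files, the following minimal Python sketch parses text-bound annotations and lists the pairs in which one entity lies inside another. The two-entity annotation is a made-up example rather than a record from NEREL-BIO, the names Entity, parse_brat_entities, and nested_pairs are hypothetical, and discontinuous spans are skipped for brevity.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    ann_id: str
    label: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    text: str


def parse_brat_entities(ann_text):
    """Parse text-bound ('T') annotations from BRAT standoff content."""
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # relations, events, notes, etc. are ignored here
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # malformed line
        ann_id, type_and_span, text = parts[0], parts[1], parts[2]
        label, span = type_and_span.split(" ", 1)
        if ";" in span:
            continue  # discontinuous span: skipped in this sketch
        start, end = (int(x) for x in span.split())
        entities.append(Entity(ann_id, label, start, end, text))
    return entities


def nested_pairs(entities):
    """Return (outer, inner) pairs where inner lies strictly within outer."""
    pairs = []
    for outer in entities:
        for inner in entities:
            same_span = (inner.start, inner.end) == (outer.start, outer.end)
            inside = outer.start <= inner.start and inner.end <= outer.end
            if inner is not outer and inside and not same_span:
                pairs.append((outer, inner))
    return pairs


# Made-up example: "vein" (ANATOMY) nested in "deep vein thrombosis" (DISO).
ANN = (
    "T1\tDISO 0 20\tdeep vein thrombosis\n"
    "T2\tANATOMY 5 9\tvein\n"
    "R1\tEXAMPLE_REL Arg1:T1 Arg2:T2\n"
)

if __name__ == "__main__":
    for outer, inner in nested_pairs(parse_brat_entities(ANN)):
        print(f"{inner.label}:{inner.text!r} inside {outer.label}:{outer.text!r}")
```
]]></preformat>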
        <p>Compared to the original NEREL-BIO dataset, we fixed some annotators’ errors, merged PRODUCT
and DEVICE type classes into DEVICE class and selected the eight most common medical entities from
the dataset: FINDING, DISO, INJURY_POISONING, PHYS, DEVICE, LABPROC, ANATOMY, CHEM.
The resulting dataset comprises 662 annotated PubMed abstracts in Russian and 104 parallel abstracts
in Russian and English. 104 parallel abstracts were randomly split for training and validation sets for
each subtask.</p>
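        <p>The label preparation described above (merging PRODUCT into DEVICE and keeping only the eight selected types) could be expressed roughly as follows. This is a sketch under assumptions: the tuple layout (label, start, end) and the function name normalize_labels are hypothetical, and the organizers' actual preprocessing scripts are not shown in this report.</p>
        <preformat><![CDATA[
```python
# The eight entity types retained for BioNNE, as listed in the text.
BIONNE_TYPES = {
    "FINDING", "DISO", "INJURY_POISONING", "PHYS",
    "DEVICE", "LABPROC", "ANATOMY", "CHEM",
}

# PRODUCT annotations are folded into DEVICE.
MERGED_TYPES = {"PRODUCT": "DEVICE"}


def normalize_labels(annotations):
    """Map merged labels and drop entity types outside the BioNNE set.

    `annotations` is an iterable of (label, start, end) tuples
    (a hypothetical layout chosen for this sketch).
    """
    kept = []
    for label, start, end in annotations:
        label = MERGED_TYPES.get(label, label)
        if label in BIONNE_TYPES:
            kept.append((label, start, end))
    return kept


if __name__ == "__main__":
    raw = [("PRODUCT", 0, 8), ("DEVICE", 10, 15), ("DISO", 20, 30), ("PERSON", 40, 45)]
    print(normalize_labels(raw))
```
]]></preformat>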
        <p>A novel test set was developed for the shared task, consisting of 154 abstracts in English and Russian.
To avoid manual annotation, 346 extra files were added for each language, resulting in 500 abstracts for
each of the target languages. These supplementary files were excluded from the final evaluation.</p>
        <p>Table 2 shows the number of entities represented in each part of the data set. Observations can be
summarized as follows. Entities labeled as DISO and ANATOMY are the most frequent across all sets,
with DISO being particularly prevalent in both training and test sets. Categories such as DEVICE and
INJURY_POISONING have a much lower number of entities compared to others, highlighting potential
areas where entity recognition might be more challenging due to data sparsity. The number of entities
in the English test set (EN_test) and the Russian test set (RU_test) are relatively comparable, although
slight variations are observed, particularly in the ANATOMY and DISO categories.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <sec id="sec-4-1">
        <title>4.1. Evaluation Metric</title>
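        <p>Macro-averaged F1 was used as the main evaluation metric. It is calculated according to the following formula:</p>
        <disp-formula>
          <label>(1)</label>
          <tex-math><![CDATA[F1_{\mathrm{macro}} = \frac{1}{N} \sum_{c \in C} F1_{c}]]></tex-math>
        </disp-formula>
        <p>where <italic>C</italic> = {“FINDING”, “DISO”, “INJURY_POISONING”, “PHYS”, “DEVICE”, “LABPROC”,
“ANATOMY”, “CHEM”}, <italic>N</italic> is the size of <italic>C</italic>, <italic>F1<sub>c</sub></italic> is the F1-score for class <italic>c</italic>, and <italic>F1<sub>macro</sub></italic> is the macro F1-score averaged over all entity classes.</p>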
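        <p>A minimal sketch of the macro-averaged F1 computation is given below. It assumes that a prediction counts as correct only when its type and both span boundaries match a gold annotation exactly; the matching criterion is not stated in this section, so this is an assumption, as are the function names and the (label, start, end) tuple layout. In this sketch a class with neither gold nor predicted entities contributes 0.</p>
        <preformat><![CDATA[
```python
CLASSES = [
    "FINDING", "DISO", "INJURY_POISONING", "PHYS",
    "DEVICE", "LABPROC", "ANATOMY", "CHEM",
]


def f1_for_class(gold, pred, cls):
    """F1 for one class, using exact (label, start, end) matching."""
    gold_c = {e for e in gold if e[0] == cls}
    pred_c = {e for e in pred if e[0] == cls}
    true_pos = len(gold_c & pred_c)
    precision = true_pos / len(pred_c) if pred_c else 0.0
    recall = true_pos / len(gold_c) if gold_c else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, classes=CLASSES):
    """Unweighted mean of per-class F1 over all classes."""
    return sum(f1_for_class(gold, pred, c) for c in classes) / len(classes)


if __name__ == "__main__":
    gold = {("DISO", 0, 20), ("ANATOMY", 5, 9)}
    pred = {("DISO", 0, 20), ("ANATOMY", 0, 9)}  # wrong ANATOMY boundaries
    print(macro_f1(gold, pred, classes=["DISO", "ANATOMY"]))
```
]]></preformat>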
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Baseline Solution</title>
        <p>We used the BINDER model [12] as a baseline solution for the BioNNE task. BINDER utilizes
two encoders to map text and entity types into a shared vector space. It efficiently reuses vector
representations of entity types for various text spans (or vice versa), which accelerates training and
inference. Leveraging bi-encoder representations, BINDER introduces a contrastive learning
framework for NER. This framework encourages similarity between the representations of entity types and
their corresponding mentions while encouraging dissimilarity with non-entity text spans. Additionally,
BINDER introduces a dynamic thresholding loss in contrastive learning. During testing, it employs
candidate-specific dynamic thresholds to differentiate entity spans from non-entity ones. For our
backbone model, we utilized the multilingual BERGAMOT model<sup>3</sup> [13], which is pre-trained on the
Unified Medical Language System (UMLS) (version 2020AB) using a Graph Attention Network (GAT)
encoder [14]. The best-performing results were achieved with the following hyperparameters: a learning
rate of 3e-5 and 5 training epochs. AdamW was used as the optimizer [15].</p>
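        <p>The inference-time decision rule described above can be illustrated with a toy sketch. It assumes, as a simplification, that the dynamic threshold for an entity type is the similarity between that type and a sentence-level anchor vector, and that a span is accepted when its similarity to the type exceeds this threshold. The vectors are hand-written stand-ins for encoder outputs, and the names cosine and predict_spans are hypothetical; this is not the BINDER implementation.</p>
        <preformat><![CDATA[
```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_spans(span_vecs, type_vecs, anchor_vec):
    """Accept (span, type) pairs whose similarity beats the type's threshold.

    The threshold for each type is its similarity to `anchor_vec`
    (an assumption of this sketch). Every span is scored independently,
    so nested spans can be accepted together.
    """
    predictions = []
    for type_name, type_vec in type_vecs.items():
        threshold = cosine(anchor_vec, type_vec)
        for span, span_vec in span_vecs.items():
            if cosine(span_vec, type_vec) > threshold:
                predictions.append((span, type_name))
    return sorted(predictions)


# Toy 2-d vectors standing in for encoder outputs.
TYPE_VECS = {"DISO": [1.0, 0.0], "ANATOMY": [0.0, 1.0]}
ANCHOR = [1.0, 1.0]
SPAN_VECS = {
    (0, 20): [0.9, 0.1],     # close to DISO
    (5, 9): [0.1, 0.9],      # close to ANATOMY, nested in (0, 20)
    (10, 20): [-0.5, -0.5],  # non-entity span
}

if __name__ == "__main__":
    print(predict_spans(SPAN_VECS, TYPE_VECS, ANCHOR))
```
]]></preformat>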
        <p>To address the cross-lingual transfer problem, the baseline model was trained and evaluated on several
language configurations: RU, EN, and RU+EN. The highest scores were achieved on the combined RU and
EN subsets (see Table 3). Additionally, models that were trained on one language showed comparatively
high results when evaluated on the other language, with training on Russian data proving to be more
effective. This effectiveness can be attributed to the difference in the size of the training data. Thus,
combined with the results from the participants’ models, we can conclude that cross-lingual techniques
can be effectively applied to the NNER task.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Official BioNNE Results</title>
        <p>We observed strong interest in the shared task, with 26 teams registered in CodaLab. We received
155 submissions from 5 teams. One team opted to withdraw their results from the official publication.
We summarize performance for all tracks in Table 3. Below, we give an overview of the participants’ approaches.</p>
        <p>Team fulstock achieved the best results by using the BINDER model. In contrast with the baseline
architecture, it has XLM-RoBERTa [16] as a backbone model. The participant experimented with
different ways of describing entity types (prompts) for BINDER learning. The following prompt variants
were used: keyword (name of the entity type); the 2, 5, or 10 most frequent component words
for the entity type in the training data; contextual prompt (an example of a sentence with the target entity);
and lexical prompt (an example of a sentence in which the target entity is masked with the entity label) [17].
The model was trained for 64 epochs. Results have shown that the contextual Russian named entity
type description proved to be the best option for the bilingual track (achieving an F1-score of 0.704), while
for the Russian-only track, one worked the best (F1-score of 0.698). In the English track, the prompt with the 10 most
frequent English components yielded an F1-score of 0.6181. Thus, these prompts helped the team
take first place in the BioNNE competition on all three tracks.</p>
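        <p>The “most frequent component words” prompt variant can be sketched as follows. The code assumes whitespace tokenization and lowercasing, neither of which is specified in the text, and the function name frequent_word_prompts and the toy mentions are hypothetical.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def frequent_word_prompts(mentions, k):
    """Build one prompt per entity type from its k most frequent words.

    `mentions` is an iterable of (label, mention_text) pairs taken from
    training annotations. Tokenization is a plain whitespace split.
    """
    counts = {}
    for label, text in mentions:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return {
        label: " ".join(word for word, _ in counter.most_common(k))
        for label, counter in counts.items()
    }


# Toy mentions (hypothetical, not taken from the training set).
MENTIONS = [
    ("DISO", "deep vein thrombosis"),
    ("DISO", "vein thrombosis"),
    ("DISO", "thrombosis"),
    ("ANATOMY", "right ventricle"),
    ("ANATOMY", "ventricle"),
]

if __name__ == "__main__":
    print(frequent_word_prompts(MENTIONS, k=2))
```
]]></preformat>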
        <p>Team wenxinzh [18] used the combination of a pre-trained Mixtral model [19] and
en_ner_bc5cdr_md, a spaCy NER model trained on the BC5CDR corpus [20]. They also adapted
and customized rules based on semantic types of UMLS (Unified Medical Language System). First, the
system uses Mixtral and en_ner_bc5cdr_md to extract potential entities for each category from the text.
Then, the system finds the UMLS semantic type associated with the entities to determine their final
entity types. The team applied the system to the English subtask and achieved third place in the overall
results, with an F1 score of 0.348.</p>
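        <p>The final step of this pipeline, assigning an entity type from a UMLS semantic type, might look like the sketch below. The semantic type names are standard UMLS ones, but the mapping to BioNNE labels and the lookup dictionary standing in for a UMLS query are assumptions made for illustration; the team's actual rules are described in [18].</p>
        <preformat><![CDATA[
```python
# Assumed mapping from UMLS semantic types to BioNNE labels (illustrative only).
SEMTYPE_TO_LABEL = {
    "Disease or Syndrome": "DISO",
    "Sign or Symptom": "DISO",
    "Injury or Poisoning": "INJURY_POISONING",
    "Laboratory Procedure": "LABPROC",
    "Diagnostic Procedure": "LABPROC",
    "Medical Device": "DEVICE",
    "Body Part, Organ, or Organ Component": "ANATOMY",
    "Cell": "ANATOMY",
    "Pharmacologic Substance": "CHEM",
    "Organic Chemical": "CHEM",
    "Physiologic Function": "PHYS",
    "Finding": "FINDING",
}


def assign_labels(candidates, semtype_of):
    """Keep candidates whose semantic type maps to a BioNNE label.

    `candidates` are mention strings proposed by upstream extractors;
    `semtype_of` is a hypothetical dict from lowercased mention to UMLS
    semantic type, standing in for a real UMLS lookup.
    """
    labelled = []
    for mention in candidates:
        semtype = semtype_of.get(mention.lower())
        label = SEMTYPE_TO_LABEL.get(semtype)
        if label is not None:
            labelled.append((mention, label))
    return labelled


if __name__ == "__main__":
    lookup = {
        "diabetes mellitus": "Disease or Syndrome",
        "catheter": "Medical Device",
    }
    print(assign_labels(["Diabetes Mellitus", "catheter", "patients"], lookup))
```
]]></preformat>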
        <p>Team hasin.rehana [21] processed the BioNNE dataset by splitting each abstract into sentences and
mapping the corresponding annotations to these sentences. Then, they applied the BIO-tagging
scheme, a well-known encoding for named entity recognition. Tokens were encoded as B-TYPE
for the beginning of an entity, I-TYPE for subsequent tokens of the same entity, and O for tokens that
do not belong to any entity class. Overall, six levels of BIO-tagging were applied to the BioNNE dataset.
The core of the model for English NNER is the pre-trained PubMedBERT, which provides contextualized
word embeddings [22]. For the Russian NNER task, the team used a pre-trained SBERT-Large-NLU-RU
model<sup>4</sup>. For Bilingual NNER, they employed BERT-Base-Multilingual-uncased [23]. A series of six
classification layers was added to the base model. Each layer was designed to output a specific level of
NER tags, with each linear layer taking the hidden states from PubMedBERT and mapping them to the
required number of labels for that layer. Although the original number of classes in the BioNNE dataset
is eight, the total number of output classes for each classification layer is 17 to support the preprocessed
BIO-tagged dataset. This includes “B-Class” and “I-Class” for each of the eight original classes, as well
as an “O” class for any token that does not belong to any entity class. To enrich the NER process, the
team leveraged the UMLS Metathesaurus for vocabulary expansion. They utilized the MRCONSO.RRF
data file within UMLS to extract relevant concepts and their child concepts based on the concept IDs
provided by the BioNNE challenge organizers, which broadened the model’s ability to recognize entities
by incorporating synonyms and related terms. For these experiments, the team used six NVIDIA
Tesla V100 GPUs with 32GB of HBM2 VRAM each.</p>
        <sec id="sec-4-3-1">
          <p><sup>3</sup> https://huggingface.co/andorei/BERGAMOT-multilingual-GAT</p>
          <p><sup>4</sup> https://huggingface.co/ai-forever/sbert_large_nlu_ru</p>
        </sec>
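        <p>The multi-level BIO encoding used by team hasin.rehana can be illustrated with the sketch below, which also shows where the 17 output labels per layer come from (two tags for each of the eight classes plus “O”). How the team assigned entities to levels is not described here, so the strategy in this sketch (longest entities first, each placed on the first level where its tokens are still free) is an assumption, as are the function name layered_bio and the token-index span layout.</p>
        <preformat><![CDATA[
```python
CLASSES = [
    "FINDING", "DISO", "INJURY_POISONING", "PHYS",
    "DEVICE", "LABPROC", "ANATOMY", "CHEM",
]

# "O" plus B- and I- tags for each of the eight classes: 17 labels.
LABELS = ["O"] + [f"{prefix}-{cls}" for cls in CLASSES for prefix in ("B", "I")]


def layered_bio(n_tokens, entities, n_levels=6):
    """Encode nested entities as several flat BIO tag sequences.

    `entities` holds (start_token, end_token_exclusive, label) triples.
    Longer entities are placed first; each entity goes to the first level
    whose tokens in that range are still "O". Entities that fit on no
    level are dropped in this sketch.
    """
    levels = [["O"] * n_tokens for _ in range(n_levels)]
    ordered = sorted(entities, key=lambda e: (-(e[1] - e[0]), e[0]))
    for start, end, label in ordered:
        for tags in levels:
            if all(tag == "O" for tag in tags[start:end]):
                tags[start] = "B-" + label
                for i in range(start + 1, end):
                    tags[i] = "I-" + label
                break
    return levels


if __name__ == "__main__":
    tokens = ["deep", "vein", "thrombosis"]
    nested = [(0, 3, "DISO"), (1, 2, "ANATOMY")]
    for depth, tags in enumerate(layered_bio(len(tokens), nested, n_levels=2)):
        print(depth, list(zip(tokens, tags)))
```
]]></preformat>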
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we present the organizers’ report on the competition for nested named entity recognition
systems in the biomedical domain (BioNNE). The competition included three subtasks: Bilingual,
English-oriented, and Russian-oriented. The participants were asked to extract the eight most common
medical entities, both in Russian and English, which can contain each other, from PubMed abstracts.
The best results in the evaluation were achieved by using the BINDER model based on bi-encoder
representations and a contrastive learning framework. The winner experimented with different ways of
entity type description (prompts) for BINDER learning. We hope that the outcomes of the competition
will foster further research and development in nested NER for healthcare applications.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The work of E.T. has been supported by the Russian Science Foundation grant # 23-11-00358. We would
like to thank all the participating teams who contributed to the success of the shared task through their
interesting approaches and experiments.</p>
      <ref-list>
        <ref>
          <mixed-citation>[10] A. Nentidis, G. Katsimpras, A. Krithara, S. Lima-López, E. Farré-Maduell, M. Krallinger, N. Loukachevitch, V. Davydova, E. Tutubalina, G. Paliouras, Overview of BioASQ 2024: The twelfth BioASQ challenge on Large-Scale Biomedical Semantic Indexing and Question Answering, in: L. Goeuriot, P. Mulhem, G. Quénot, D. Schwab, L. Soulier, G. Maria Di Nunzio, P. Galuščáková, A. García Seco de Herrera, G. Faggioli, N. Ferro (Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024), 2024.</mixed-citation>
        </ref>
        <ref id="ref11">
          <mixed-citation>[11] P. Stenetorp, S. Pyysalo, G. Topić, T. Ohta, S. Ananiadou, J. Tsujii, brat: a web-based tool for NLP-assisted text annotation, in: Proceedings of the Demonstrations Session at EACL 2012, Association for Computational Linguistics, Avignon, France, 2012.</mixed-citation>
        </ref>
        <ref id="ref12">
          <mixed-citation>[12] S. Zhang, H. Cheng, J. Gao, H. Poon, Optimizing bi-encoder for named entity recognition via contrastive learning, in: The Eleventh International Conference on Learning Representations, 2022.</mixed-citation>
        </ref>
        <ref id="ref13">
          <mixed-citation>[13] A. Sakhovskiy, N. Semenova, A. Kadurin, E. Tutubalina, Biomedical entity representation with graph-augmented multi-objective transformer, in: Findings of the Association for Computational Linguistics: NAACL 2024, Association for Computational Linguistics, Mexico City, Mexico, 2024.</mixed-citation>
        </ref>
        <ref id="ref14">
          <mixed-citation>[14] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph attention networks, arXiv preprint arXiv:1710.10903 (2018).</mixed-citation>
        </ref>
        <ref id="ref15">
          <mixed-citation>[15] I. Loshchilov, F. Hutter, Decoupled weight decay regularization, in: International Conference on Learning Representations, 2019. URL: https://openreview.net/forum?id=Bkg6RiCqY7.</mixed-citation>
        </ref>
        <ref id="ref16">
          <mixed-citation>[16] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning at scale, CoRR abs/1911.02116 (2019). URL: http://arxiv.org/abs/1911.02116. arXiv:1911.02116.</mixed-citation>
        </ref>
        <ref id="ref17">
          <mixed-citation>[17] I. Rozhkov, N. Loukachevitch, Prompts in few-shot named entity recognition, Pattern Recognition and Image Analysis 33 (2023) 122–131.</mixed-citation>
        </ref>
        <ref id="ref18">
          <mixed-citation>[18] W. Zhou, Biomedical Nested NER with Large Language Model and UMLS Heuristics, in: CLEF Working Notes, 2024.</mixed-citation>
        </ref>
        <ref id="ref19">
          <mixed-citation>[19] A. Q. Jiang, A. Sablayrolles, A. Roux, A. Mensch, B. Savary, C. Bamford, D. S. Chaplot, D. de las Casas, E. B. Hanna, F. Bressand, G. Lengyel, G. Bour, G. Lample, L. R. Lavaud, L. Saulnier, M.-A. Lachaux, P. Stock, S. Subramanian, S. Yang, S. Antoniak, T. L. Scao, T. Gervet, T. Lavril, T. Wang, T. Lacroix, W. E. Sayed, Mixtral of experts, 2024. arXiv:2401.04088.</mixed-citation>
        </ref>
        <ref id="ref20">
          <mixed-citation>[20] J. Li, Y. Sun, R. J. Johnson, D. Sciaky, C. Wei, R. Leaman, A. P. Davis, C. J. Mattingly, T. C. Wiegers, Z. Lu, Biocreative V CDR task corpus: a resource for chemical disease relation extraction, Database J. Biol. Databases Curation 2016 (2016). URL: https://doi.org/10.1093/database/baw068. doi:10.1093/database/baw068.</mixed-citation>
        </ref>
        <ref id="ref21">
          <mixed-citation>[21] H. Rehana, B. Bansal, N. Bengisu Çam, J. Zheng, Y. He, A. Özgür, J. Hur, Nested Named Entity Recognition using Multilayer BERT-based Model, in: CLEF Working Notes, 2024.</mixed-citation>
        </ref>
        <ref id="ref22">
          <mixed-citation>[22] Y. Gu, R. Tinn, H. Cheng, M. Lucas, N. Usuyama, X. Liu, T. Naumann, J. Gao, H. Poon, Domain-specific language model pretraining for biomedical natural language processing, ACM Transactions on Computing for Healthcare (HEALTH) 3 (2021) 1–23.</mixed-citation>
        </ref>
        <ref id="ref23">
          <mixed-citation>[23] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: pre-training of deep bidirectional transformers for language understanding, CoRR abs/1810.04805 (2018). URL: http://arxiv.org/abs/1810.04805. arXiv:1810.04805.</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>Gaussian prior reinforcement learning for nested named entity recognition</article-title>
          ,
          <source>arXiv preprint arXiv:2305.12003</source>
          (
          <year>2023</year>
          ). URL: https://arxiv.org/abs/2305.12003.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Wajsburt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Taillé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Tannier</surname>
          </string-name>
          ,
          <article-title>Effect of depth order on iterative nested named entity recognition models</article-title>
          ,
          <source>arXiv preprint arXiv:2104.00542</source>
          (
          <year>2021</year>
          ). URL: https://arxiv.org/abs/2104.00542.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>U.</given-names>
            <surname>Yaseen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Schütze</surname>
          </string-name>
          ,
          <article-title>Linguistically informed relation extraction and neural architectures for nested named entity recognition in bionlp-ost 2019</article-title>
          ,
          <source>arXiv preprint arXiv:1910.03549</source>
          (
          <year>2019</year>
          ). URL: https://arxiv.org/abs/1910.03549.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Ming</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>An</surname>
          </string-name>
          ,
          <article-title>Few-shot nested named entity recognition</article-title>
          ,
          <source>arXiv preprint arXiv:2212.00968</source>
          (
          <year>2022</year>
          ). URL: https://arxiv.org/abs/2212.00968.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Loukachevitch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Artemova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Batura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Braslavski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ivanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Manandhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pugachev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Rozhkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shelmanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Tutubalina</surname>
          </string-name>
          , et al.,
          <article-title>Nerel: a russian information extraction dataset with rich annotation for nested entities, relations, and wikidata entity links</article-title>
          ,
          <source>Language Resources and Evaluation</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>37</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E.</given-names>
            <surname>Artemova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zmeev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Loukachevitch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Rozhkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Batura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ivanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Tutubalina</surname>
          </string-name>
          ,
          <article-title>Runne2022 shared task: Recognizing nested named entities</article-title>
          ,
          <source>Komp'juternaja Lingvistika i Intellektual'nye Tehnologii</source>
          <year>2022</year>
          (
          <year>2022</year>
          )
          <fpage>33</fpage>
          -
          <lpage>41</lpage>
          . doi:10.28995/2075-7182-2022-21-33-41.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.-D.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ohta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tateisi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tsujii</surname>
          </string-name>
          ,
          <article-title>Genia corpus-a semantically annotated corpus for biotextmining</article-title>
          ,
          <source>Bioinformatics</source>
          <volume>19</volume>
          (
          <year>2003</year>
          )
          <fpage>i180</fpage>
          -
          <lpage>i182</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Katiyar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cardie</surname>
          </string-name>
          ,
          <article-title>Nested named entity recognition revisited</article-title>
          ,
          <source>in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , volume
          <volume>1</volume>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>N.</given-names>
            <surname>Loukachevitch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Manandhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Baral</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Rozhkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Braslavski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ivanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Batura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Tutubalina</surname>
          </string-name>
          ,
          <article-title>NEREL-BIO: A Dataset of Biomedical Abstracts Annotated with Nested Named Entities</article-title>
          ,
          <source>Bioinformatics</source>
          (
          <year>2023</year>
          ). URL: https://doi.org/10.1093/bioinformatics/btad161. doi:10.1093/bioinformatics/btad161,
          <fpage>btad161</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nentidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Katsimpras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krithara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lima-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Farré-Maduell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krallinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Loukachevitch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Davydova</surname>
          </string-name>
          , E. Tutubalina, G. Paliouras,
          <source>Overview of BioASQ</source>
          <year>2024</year>
          : The
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>