<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Leveraging Biomedical Ontologies to Boost Performance of BERT-Based Models for Answering Medical MCQs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sahil Sahil</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>P Sreenivasa Kumar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science &amp; Engineering, Indian Institute of Technology Madras</institution>
          ,
          <addr-line>Chennai</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
        <p>Large-scale pretrained language models like BERT have shown promising results in various natural language processing tasks. However, these models do not benefit from the rich knowledge available in domain ontologies. In this work, we propose BioOntoBERT, a BERT-based model pretrained on multiple biomedical ontologies. We also introduce the Onto2Sen system to process various ontologies to generate lexical documents, such as entity names, synonyms and definitions, and concept relationship documents. We then incorporate these knowledge-rich documents during pretraining to enhance the model's “understanding” of biomedical concepts. We evaluate our model on the MedMCQA dataset, a multiple-choice question-answering benchmark for the medical domain. Our experiments show that BioOntoBERT outperforms the baseline BERT model as well as SciBERT, BioBERT and PubMedBERT. BioOntoBERT achieves this performance improvement by incorporating only 158MB of ontology-generated data on top of the BERT model during pretraining, just 0.75% of the data used in pretraining PubMedBERT. Our results demonstrate the effectiveness of incorporating biomedical ontologies in pretraining language models for the medical domain.</p>
      </abstract>
      <kwd-group>
        <kwd>Biomedical Ontologies</kwd>
        <kwd>BERT</kwd>
        <kwd>Medical Multiple Choice Question Answering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Biomedical ontology research encompasses a variety of entities (from dictionaries of names
for biological products to controlled vocabularies to principled knowledge structures) and
processes (i.e., acquisition of ontological relations, integration of heterogeneous databases, use
of ontologies for reasoning about biological knowledge) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Biomedical ontologies include
various aspects of medical terminologies such as symptoms, diagnosis and treatment.
      </p>
      <p>Multiple-choice question-answering (MCQA) is a challenging task in general, and
particularly so in the medical domain, as the relevant knowledge is not commonly available
in text corpora. The success of MCQA systems relies on striking a delicate balance between
language understanding, domain-specific reasoning, and the incorporation of rich knowledge
sources.</p>
      <p>In the medical domain, ontology-based QA systems have great potential
to effectively capture domain-specific knowledge and provide accurate responses to medical
queries. By harnessing biomedical ontologies, these systems can depict intricate relationships
among medical concepts, resulting in more precise and contextually aware answers.</p>
      <p>
        Ontology-based multiple-choice question-answering systems are few in number, but
ontology-based QA systems have shown promise in capturing domain-specific knowledge and
accurately answering medical questions [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. By leveraging biomedical ontologies, these
systems can represent complex relationships between medical concepts, enabling more precise and
contextually aware responses. A major limitation is that using these systems requires an
understanding of the ontology structure in order to formulate queries. For example, queries may
necessitate using intermediate concepts in the ontology when there is no direct relationship
between the concepts in question.
      </p>
      <p>
        Contextual word embedding models, such as BERT (Bidirectional Encoder Representations
from Transformers) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] have achieved state-of-the-art results in many NLP tasks. Initially
tested in a general domain, models such as BioBERT[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], UmlsBERT [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], SciBERT [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and
PubmedBERT [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], have also been successfully applied in the biomedical domain by pretraining
them on biomedical corpora. However, current biomedical applications of transformer-based
NLP models do not incorporate structured expert domain knowledge from a biomedical
ontology into their embedding pretraining process.
      </p>
      <p>To illustrate the significance of biomedical ontology knowledge, let’s consider a scenario
where a medical question pertains to a specific rare disease. While a pretrained language
model trained on a vast corpus may have encountered related terms or phrases, it may lack
the medical domain-specific knowledge required to provide accurate and nuanced answers.
In contrast, a biomedical ontology encompasses structured and domain-specific knowledge,
including relationships, hierarchies, and semantic information about medical concepts. By
integrating such ontology knowledge into our models, we can tap into a comprehensive and
precise representation of medical domain knowledge, enabling more accurate and
contextualized question-answering.</p>
      <p>In light of this research gap, our study aims to bridge the divide between ontology-based
approaches and deep learning models in the context of MCQA in the medical domain. Specifically,
our objectives are:
• To overcome the challenges of ontology injection, including the computational overhead
and annotation burden associated with large biomedical ontologies.
• To investigate techniques for integrating biomedical ontological knowledge with
pretrained BERT models in MCQA systems.</p>
      <p>
        In this paper, we present a novel approach that bridges the gap between ontology-based
methods and pretrained language models, harnessing the strengths of both to enhance
multiple-choice question-answering (MCQA) in the medical domain. Our contributions to this work can
be summarized as follows:
1. Onto2Sen, a simple yet effective solution for Ontology Injection: We propose a unique
solution called Onto2Sen system to generate a comprehensive ontology-backed sentence
corpus, which serves as a valuable resource for enriching pretrained models with
domain-specific knowledge. By incorporating this rich semantic information from biomedical
domain ontologies into the models, we anticipate enhancing their contextual understanding
and reasoning abilities.
2. Introducing BioOntoBERT: We propose BioOntoBERT, a pretrained BERT model that
leverages various Biomedical Ontologies using the Onto2Sen generated corpus.
BioOntoBERT surpasses several other biomedical BERT models, including PubmedBERT [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
SciBERT[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and BioBERT [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], in terms of performance for multiple-choice question
answering on the MedMCQA dataset.
      </p>
      <p>Furthermore, BioOntoBERT demonstrates remarkable performance with just 158MB of
pretraining data, significantly reducing the computational cost and carbon footprint associated
with larger models. This aspect makes our novel approach not only effective but also
environmentally friendly, addressing the growing concerns regarding energy consumption in deep
learning models and highlighting the power of knowledge.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>Biomedical Multiple Choice Question Answering (MCQA) is a significant task in natural
language processing. Various approaches have been proposed to improve the performance of
MCQA systems by leveraging ontologies and pretrained language models.</p>
      <p>
        As mentioned earlier, ontology-based MCQA models are relatively limited, while
ontology-based question-answering systems have shown promise in capturing domain-specific
knowledge and providing accurate answers to medical questions. For instance, in the case of XMQAS
proposed by Midhunlal et al.[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], the system utilized natural language processing techniques
and ontology-based analysis to process medical queries and extract relevant information from
medical documents. Other approaches, like the one presented by Kwon et al.[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for
stroke-related knowledge retrieval, employed SPARQL templates and medical knowledge QA query
ontology to transform queries into executable SPARQL queries for retrieving medical
knowledge. However, these approaches have limitations due to their reliance on a template-based
approach, which may restrict the flexibility and adaptability of the system.
      </p>
      <p>
        In addition to ontology-based approaches, using pretrained models has significantly
advanced MCQA systems. One notable example is PubmedBERT [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], a variant of BERT designed
explicitly for biomedical text comprehension. These pretrained models, including
PubmedBERT, have showcased remarkable performance in capturing medical terminologies and
comprehending complex medical questions. Moreover, models like BioBERT [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], SciBERT[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and
UmlsBERT [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] have been finetuned for biomedical NLP tasks, exhibiting improved performance
in various medical question-answering and information retrieval tasks. It is worth noting that
these models are pretrained on extensive corpora, such as the entire PubMed abstracts dataset,
which consists of over 3.1 billion words.
      </p>
      <p>
        Less work has been done on using external knowledge with neural networks in
biomedical multiple-choice question answering, whereas in other domains, such as
commonsense reasoning, several different approaches have been investigated for leveraging
external knowledge sources. Sap et al.[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] introduce the ATOMIC graph with 877k textual
descriptions of inferential knowledge (e.g. if-then relation) to answer causal questions. Lv et
al.[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] propose to extract evidence from both structured knowledge bases such as ConceptNet
and Wikipedia text and conduct graph-based representation and inference for commonsense
reasoning.
      </p>
      <p>
        He et al.[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] proposed a training procedure to infuse disease knowledge and augment
pretrained BERT models. Their experiments demonstrated improved performance in consumer
health question answering, medical language inference, and disease name recognition. This
motivates us to leverage the strengths of ontologies, which excel at representing complex
medical concepts and terminologies. By integrating ontologies with BERT-based models, we aim to
enhance the capabilities of our MCQA system and improve its accuracy and effectiveness in
addressing biomedical questions.
      </p>
      <p>
        To bridge the gap between ontology-based approaches and deep learning models, the
authors of [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] have explored techniques for ontology injection and infusing context.
These approaches aim to enhance the models’ language understanding and domain-specific
reasoning capabilities by injecting ontological information into the models by modifying or
adding new BERT layers or mapping the concepts and relationships of the ontology to the
data. However, these models face various challenges in processing and incorporating large
biomedical ontologies. The computational overhead required to handle and integrate the vast
knowledge in such ontologies can be significantly high. Moreover, the process of mapping the
ontology with the dataset and preparing annotated data demands substantial time and labour
resources. The manual effort required for this task can be burdensome, hindering the scalability
and practicality of these approaches.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Biomedical Ontologies</title>
      <p>
        Biomedical ontologies play a critical role in the field of medicine by organizing and
representing knowledge related to diseases, genes, anatomical structures, and medical concepts. They
establish a standardized framework that captures and integrates information, promoting data
sharing, interoperability, and knowledge discovery. We now briefly describe the prominent
biomedical ontologies we use for our model:
1. Disease Ontology (DO) [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] (v1.2): The Disease Ontology is a standardized ontology
created to offer the biomedical community consistent, reusable, and sustainable
descriptions of human disease terms, phenotype characteristics, and related medical vocabulary
disease concepts.
2. Gene Ontology (GO) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] (v2023-04-01): It is a widely used ontology that focuses on
representing the functional attributes of genes and gene products across different species.
GO encompasses three main domains: Biological Process (BP), Molecular Function (MF),
and Cellular Component (CC). BP describes biological processes in which genes are
involved, MF represents the molecular functions they perform, and CC defines their cellular
locations.
3. Foundational Model of Anatomy Ontology (FMAO) [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] (v5.0.0): FMAO is an
ontology that aims to represent human anatomy in a detailed and structured manner. FMAO
provides a hierarchical organization of anatomical structures, capturing spatial
relationships and functional associations between different body parts.
4. Precision Medicine Ontology [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] (v4.0): It is a comprehensive ontology that
represents medical concepts and their relationships in a standardized manner. It
covers various medical domains, including diseases, symptoms, treatments,
diagnostic procedures, and medical devices.
5. Bioassay Ontology (BAO) [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] (v1.1): The BAO focuses on establishing common
reference metadata terms and definitions required for describing relevant information of
low-and high-throughput drug and probe screening assays and results.
6. Dental Ontology [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] (v2016-06-27): It captures dental-related concepts and
relationships, providing a standardized vocabulary for representing dental conditions,
procedures, materials, and anatomical structures. It facilitates the integration of dental data
and knowledge, supporting research, education, and clinical practice in dentistry.
7. Pediatrics Ontology (v2.0): This ontology focuses on representing pediatric
healthcare-related concepts and their relationships. It covers various aspects of pediatric medicine,
including diseases, developmental milestones, treatments, and interventions.
8. Human Physiology Simulation Ontology (HPSO) [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] (v1.1.1): HPSO captures the
concepts and relationships related to the simulation and modelling of human physiology.
It provides a standardized framework for representing physiological processes, organ
interactions, and computational models.
9. Mental Disease Ontology (MDO) [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] (v2020-04-26): MDO represents mental
disorders and related concepts. It offers a standardized vocabulary for categorizing and
annotating mental diseases, symptoms, treatments, and diagnostic criteria.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>
        In this section, we present our approach for pretraining and fine-tuning a BERT[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] model
on biomedical ontologies for multiple-choice question answering on the MedMCQA dataset.
Our approach involves several key steps: data preparation, pretraining on biomedical
ontologies, and fine-tuning on the MedMCQA dataset. The code implementation is publicly available on
GitHub.
      </p>
      <sec id="sec-4-1">
        <title>4.1. Datasets</title>
        <sec id="sec-4-1-1">
          <title>4.1.1. Multiple Choice Questions Dataset</title>
          <p>
            We use the MedMCQA dataset[
            <xref ref-type="bibr" rid="ref25">25</xref>
            ], which consists of 194,000 multiple-choice questions on
around 2400 healthcare topics and 21 medical subjects from one of the toughest entrance exams
conducted for medical graduates in India, i.e., AIIMS and NEET PG. The diversity of questions
in the MedMCQA makes it a challenging dataset containing many aspects of medical
knowledge; Table 2 illustrates one such example. Another distinguishing factor of this dataset is its
questions are created for and by human experts. The dataset has three parts: the training set of
182,822 questions, the validation set of 4,183 and the test set comprising 6,150 questions, with
an average token length of 12.35, 13.91 and 9.68, respectively. The answer choices are provided
in the ‘labels’ column, encoded as integers 0, 1, 2, and 3. The ground truth for the test set is
not publicly available; hence, we analyse the results on the validation set.
          </p>
        </sec>
        <sec id="sec-4-1-2">
          <title>4.1.2. Ontology-based Sentence Generation</title>
          <p>We propose a system called Onto2Sen to generate sentences from multiple ontologies curated
from public resources mentioned in the previous section. It extracts concepts, annotations, and
their properties from the ontology to form meaningful sentences. Onto2Sen preprocesses the
ontologies and generates two types of sentences. The first type of sentence generated is from
the subClass relationships. The second type of sentence is extracted from the relevant lexical
annotation axioms in the ontology.</p>
          <p>In the example shown in Figure 1, the Class Hierarchy Relationship sentences will contain the
subClass property in the Disease Ontology (DO), allowing us to identify specific disease
classifications. For instance, we can state that ‘SPOAN syndrome is a neurodegenerative disease’
using labels and identifiers in subClass relations. In addition, the transitive nature of the subclass
properties is also utilized. Furthermore, Annotation Properties associated with diseases offer
valuable insights into symptoms, synonyms and causal associations. For instance, we can
describe that “SPOAN syndrome has synonym Spastic paraplegia” using the ‘has_exact_synonym’
annotation property.
We then used a natural language processing tool, spaCy, for preprocessing the compiled
documents. We use these generated sentences as input to the model during pretraining to leverage
the ontological knowledge.</p>
          <p>After a study of the ontologies mentioned in Section 3, we find that using annotation
properties and the class hierarchy for sentence generation is commonly applicable across all these
ontologies and hence we adopt only these techniques for the present.</p>
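          <p>The two sentence-generation steps above can be sketched as follows. This is a minimal illustration only: the function names, the in-memory input format, and the placeholder identifiers are our assumptions, not the released Onto2Sen implementation.</p>

```python
# Illustrative sketch of Onto2Sen-style sentence generation (assumed API).

def subclass_sentences(labels, subclass_of):
    """One sentence per subClassOf axiom: '<child> is a <parent>'."""
    return [f"{labels[child]} is a {labels[parent]}"
            for child, parent in subclass_of]

def annotation_sentences(labels, annotations):
    """One sentence per relevant lexical annotation axiom."""
    templates = {"has_exact_synonym": "{} has synonym {}",
                 "definition": "{} is defined as {}"}
    return [templates[prop].format(labels[concept], value)
            for concept, prop, value in annotations
            if prop in templates]

# Toy fragment inspired by the Disease Ontology example in the text
# (identifiers are illustrative placeholders, not real DOID IDs).
labels = {"DOID:A": "SPOAN syndrome",
          "DOID:B": "neurodegenerative disease"}
subclass_of = [("DOID:A", "DOID:B")]
annotations = [("DOID:A", "has_exact_synonym", "Spastic paraplegia")]

corpus = subclass_sentences(labels, subclass_of) + \
         annotation_sentences(labels, annotations)
```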
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Pretraining Model</title>
        <p>
          Pretraining is a crucial aspect of the BERT (Bidirectional Encoder Representations from
Transformers) [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] model, which has revolutionized the field of natural language processing. In the
context of BERT, pretraining refers to the initial phase where the model is trained on vast
amounts of unlabeled text data, such as web documents or books. During this pretraining
phase, BERT learns to generate contextualized representations of words and capture intricate
semantic relationships by leveraging the bidirectional nature of transformers.
        </p>
        <p>We propose a novel approach using Biomedical ontologies to pretrain the BERT model. As
mentioned in the previous section, Onto2Sen can generate a corpus of meaningful sentences
from diferent Biomedical ontologies. We use this generated corpus consisting of about 20M
words which is a substantial volume of unlabeled text data related to the medical domain. The
corpus was preprocessed and prepared for training, ensuring it was suitable for the subsequent
steps.</p>
        <p>The BERT model’s pretraining phase involves two tasks: Masked Language Modelling
(MLM) and Next Sentence Prediction. However, for our model, which incorporates
biomedical ontologies, we focus on augmenting the Masked LM task and omit the Next Sentence
Prediction task.</p>
        <p>In the Masked LM task, we mask out 15 per cent of the tokens in a sentence, and the model
is trained to predict the original tokens given the context of the surrounding words. This
approach will help the semantic understanding of medical terminologies by directly injecting
biomedical ontology concepts and properties into the input sequence. As a result, the model
will recognise and better understand medical concepts and terminologies effectively.</p>
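        <p>A simplified sketch of this masking step is shown below. Note that BERT's full recipe also replaces some selected tokens with random tokens or leaves them unchanged; this sketch masks only, for clarity.</p>

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Replace ~15% of tokens with [MASK]; return the masked sequence
    and the original tokens the model must predict at those positions."""
    rng = random.Random(seed)
    masked, targets = list(tokens), {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = mask_token
            targets[i] = tok
    return masked, targets

sentence = "SPOAN syndrome has synonym Spastic paraplegia".split()
masked, targets = mask_tokens(sentence)
```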
        <p>During the pretraining process, the BERT model was trained using the Adam optimizer, a
widely adopted optimization algorithm for neural networks. The optimizer iteratively adjusted
the model’s parameters to minimize a predefined loss function, optimizing its ability to capture
language patterns. Additionally, a learning rate scheduler was employed to dynamically adjust
the learning rate at specific intervals, facilitating improved convergence and optimization of
the model. The scheduler strategy, such as linear or exponential decay, was carefully selected
based on experimentation and optimization.</p>
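        <p>As an illustration, a linear warmup-then-decay schedule of the kind commonly paired with Adam in BERT pretraining can be written as follows. The warmup length here is a placeholder, as the exact schedule parameters are not specified in the text; the step count and base learning rate echo the settings reported in Section 5.</p>

```python
def linear_schedule(step, total_steps=200_000, base_lr=5e-5,
                    warmup_steps=10_000):
    """Linear warmup to base_lr, then linear decay to zero (illustrative)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step)
                         / (total_steps - warmup_steps))
```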
        <p>These pretraining steps establish a solid foundation for subsequent finetuning and
proficient utilization of the BioOntoBERT model across diverse downstream natural language
processing tasks.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Finetuning BERT</title>
        <p>During the fine-tuning stage, we aim to train our BioOntoBERT model to accurately answer
multiple-choice questions on the MedMCQA dataset without using any external context.</p>
        <p>Each multiple-choice question in the MedMCQA dataset was concatenated with its answer
options to form a single input sequence of the form as shown in Figure 2.</p>
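        <p>Since Figure 2 is not reproduced here, the sketch below shows one common way of forming such sequences, one per answer option; the exact template and the sample question are our assumptions, not necessarily those used by the paper.</p>

```python
def mcq_inputs(question, options, cls="[CLS]", sep="[SEP]"):
    """Pair the question with each answer option; the model scores each
    pair and the highest-scoring option is chosen."""
    return [f"{cls} {question} {sep} {option} {sep}" for option in options]

sequences = mcq_inputs("Which of the following is not a true neoplasm?",
                       ["Dentigerous cyst", "Ameloblastoma",
                        "Adenocarcinoma", "Fibrosarcoma"])
```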
        <p>Next, we performed tokenization on the dataset. Tokenization involves breaking down the
questions and answer choices into smaller units called tokens, which the model can handle.
This step ensures that the data is in a format suitable for the BioOntoBERT model to process.
After the dataset is properly tokenized, we then train the BioOntoBERT model on this data.</p>
        <p>During training, the model learned from the dataset by adjusting its internal parameters
to better capture the relationships between questions and answer choices. The goal was to
enhance the model’s capacity to accurately choose the right answer when presented with a
question. In this case, the labels were encoded in a one-hot format derived from integers.
Throughout the training process, the model iteratively refined its understanding of the task by
analyzing the patterns and context in the data. We carefully optimized the model’s performance
by adjusting various parameters, such as the learning rate and the number of training epochs.</p>
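        <p>The integer-to-one-hot label encoding mentioned above is straightforward; a minimal sketch:</p>

```python
def one_hot(label, num_classes=4):
    """Map an integer label in 0..3 to a one-hot vector of length 4."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec
```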
        <p>Once the training was completed, we evaluated the performance of the finetuned
BioOntoBERT model using the validation dataset. This evaluation allowed us to measure how well
the model performed on unseen data and provided valuable insights into its ability to answer
multiple-choice questions accurately.</p>
        <p>During the fine-tuning process and subsequent evaluation of the BioOntoBERT model, a
probability distribution is generated for each question’s answer choices. The output probability
distribution is denoted by 1, 2, 3 and 4 as shown in Figure 2. We identify the most likely
answer choice by choosing the index associated with the highest probability.</p>
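        <p>This final selection step can be sketched as a softmax over the four option scores followed by an argmax (plain Python for illustration; the scores shown are made up):</p>

```python
import math

def softmax(logits):
    """Convert raw option scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits):
    """Index of the most probable answer choice."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)
```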
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>
        The main objective of this paper is to investigate the impact of incorporating biomedical
ontology into the pretraining process of BERT models for the task of medical multiple-choice
question answering. To achieve this objective, we developed a new pretrained model,
BioOntoBERT, that is pretrained on a combination of 9 biomedical ontologies. We evaluated the
performance of BioOntoBERT on the MedMCQA dataset, which contains a set of challenging
medical questions curated by medical experts and compared it to the performance of other
pretrained models, such as PubMedBERT[
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], SciBERT[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and BioBERT[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        We conducted the pretraining of our BioOntoBERT model using the BERT base architecture,
pretrained on English Wikipedia and BooksCorpus for 1M steps. BioOntoBERT was pretrained
for 200K steps. The pretraining process involved a batch size of 32 and a learning rate of
5e-5 with scheduling. The pretraining and finetuning were both performed on a Tesla V100-PCIE-32GB
GPU, with a maximum sequence length of 128. The pretraining of BioOntoBERT on
ontology-generated sentences took only approximately 10 hours, whereas the pretraining times for
PubmedBERT and BioBERT were reported as 5 days (120 hours) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and 10 days (240 hours) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ],
respectively. For the finetuning process, a batch size of 32 and a learning rate of 1e-5 were
selected. It took approximately 30 hours to complete the finetuning process due to the large
size of the MedMCQA training data.
      </p>
      <p>BioOntoBERT outperformed the baseline BERT-base, achieving a minimum accuracy of
42.72% in 10 runs. Furthermore, BioOntoBERT also outperformed PubMedBERT, which is
pretrained on a huge corpus of biomedical text data. These results indicate that adding ontology
data to the pretraining process can improve the performance of BERT models for medical
question answering.</p>
      <p>The comparison of models in Table 3 highlights the significance of the relatively small
amount of additional ontology data we used to enhance the performance of our model. This
finding suggests that the biomedical ontology we injected into the model is highly informative
and beneficial, unlike much of the data in other corpora, which may be considered irrelevant.</p>
      <p>During the evaluation, we also conducted a comparative analysis of the performance of
BioOntoBERT, BERT, and PubmedBERT on various multiple-choice questions across
different medical subjects. One evaluated question is shown in Table 2. Notably, BioOntoBERT
correctly predicted the answer as (A): since the keywords ‘Ameloblastoma’, ‘Adenocarcinoma’,
‘Fibrosarcoma’ and ‘Neoplasia’ are present in the DOID ontology, the BioOntoBERT model would
have leveraged this knowledge. Although ‘Dentigerous cyst’ is not present in DOID, a dentigerous
cyst is a type of ‘Odontogenic Cyst’, and DOID contains a reference to ‘Odontogenic Epithelium’.
Odontogenic cysts and odontogenic epithelium are closely related, as the former is derived from
remnants of the latter and forms as a result of abnormal developmental processes during tooth
formation. In contrast, both BERT and PubmedBERT predicted the answer as (D). This demonstrates
an instance of BioOntoBERT utilizing domain-specific knowledge.</p>
      <p>The results presented in Table 4 demonstrate that BioOntoBERT exhibited superior
performance compared to PubmedBERT across various subjects, particularly those for which
ontology data was available during pretraining. Subjects like Anatomy, Biochemistry, Dental, Medicine,
and Pathology showed notable improvements by including ontology data. However, for
subjects such as ENT, Microbiology, and Radiology, where no ontology was used in our
experiments, the benefits were not as evident. Additionally, for Pharmacology, Physiology and
Psychiatry, the subject ontologies were not comprehensive enough to contribute significantly to
question-answering capabilities. These findings underscore the significance of incorporating
subject-specific ontology information to enhance the model’s understanding and performance
on domain-specific questions.</p>
      <p>Importantly, we also evaluated the impact of the size and complexity of ontologies on the
performance of the models. Surprisingly, we observed that the size or the number of concepts and
properties in the ontologies did not necessarily correlate with improved question-answering
performance. This suggests that the relevance and quality of the ontology data are crucial
factors in enhancing the model’s understanding and reasoning capabilities rather than the sheer
quantity of information.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>This study introduces the Onto2Sen system, which incorporates annotation-based and
class-hierarchical sentences from ontologies to enhance the performance of a language model. It is
the first instance of leveraging such knowledge in pretraining a language model for
biomedical natural language processing tasks. The BioOntoBERT model, pretrained on biomedical
ontologies, outperforms other models, including PubMedBERT, in multiple-choice
question-answering tasks within the medical domain, effectively capturing medical terminologies. By
achieving improved results with just 158MB of pretraining data, our approach not only
enhances performance but also significantly reduces computational costs, making it a more
sustainable approach to model training.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Future work</title>
      <p>Several directions remain open. First, the selection and incorporation of appropriate biomedical ontologies remain an ongoing
challenge. While we employed several ontologies in our pretraining process, there are
numerous other ontologies available that could potentially contribute to even better performance.
Second, although BioOntoBERT exhibits impressive proficiency in language understanding
and representation, it lacks advanced reasoning capabilities on ontologies. The model
primarily captures contextual relationships and semantic information but does not possess explicit
reasoning mechanisms to infer complex logical connections within ontologies. This
limitation suggests avenues for future research, focusing on incorporating reasoning abilities into
language models trained on biomedical ontologies.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>O.</given-names>
            <surname>Bodenreider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Burgun</surname>
          </string-name>
          ,
          <article-title>Biomedical ontologies</article-title>
          ,
          <source>Medical Informatics: Knowledge Management and Data Mining in Biomedicine</source>
          (
          <year>2005</year>
          )
          <fpage>211</fpage>
          -
          <lpage>236</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Question answering based on pervasive agent ontology and semantic web</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>22</volume>
          (
          <year>2009</year>
          )
          <fpage>443</fpage>
          -
          <lpage>448</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Arbaaeen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <article-title>Ontology-based approach to semantically enhanced question answering for closed domain: A review</article-title>
          ,
          <source>Information</source>
          <volume>12</volume>
          (
          <year>2021</year>
          )
          <fpage>200</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yoon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>So</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <article-title>Biobert: a pre-trained biomedical language representation model for biomedical text mining</article-title>
          ,
          <source>Bioinformatics</source>
          <volume>36</volume>
          (
          <year>2020</year>
          )
          <fpage>1234</fpage>
          -
          <lpage>1240</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Michalopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kaka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <article-title>UmlsBERT: Clinical domain knowledge augmentation of contextual embeddings using the unified medical language system metathesaurus</article-title>
          ,
          <source>arXiv preprint arXiv:2010.10391</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>I.</given-names>
            <surname>Beltagy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cohan</surname>
          </string-name>
          ,
          <article-title>SciBERT: A pretrained language model for scientific text</article-title>
          ,
          <source>arXiv preprint arXiv:1903.10676</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tinn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lucas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Usuyama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Naumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Poon</surname>
          </string-name>
          ,
          <article-title>Domain-specific language model pretraining for biomedical natural language processing</article-title>
          ,
          <source>ACM Transactions on Computing for Healthcare (HEALTH)</source>
          <volume>3</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Midhunlal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gopika</surname>
          </string-name>
          ,
          <article-title>Xmqas-an ontology based medical question answering system</article-title>
          ,
          <source>International Journal of Advanced Research in Computer and Communication Engineering</source>
          <volume>5</volume>
          (
          <year>2016</year>
          )
          <fpage>929</fpage>
          -
          <lpage>932</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kwon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-A.</given-names>
            <surname>Jun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-S.</given-names>
            <surname>Pyo</surname>
          </string-name>
          ,
          <article-title>Stroke medical ontology qa system for processing medical queries in natural language form</article-title>
          ,
          <source>in: 2021 International Conference on Information and Communication Technology Convergence (ICTC)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>1649</fpage>
          -
          <lpage>1654</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sap</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. Le</given-names>
            <surname>Bras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Allaway</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bhagavatula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Lourie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Rashkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Roof</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. A.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <article-title>ATOMIC: An atlas of machine commonsense for if-then reasoning</article-title>
          ,
          <source>in: Proceedings of the AAAI conference on artificial intelligence</source>
          , volume
          <volume>33</volume>
          ,
          <year>2019</year>
          , pp.
          <fpage>3027</fpage>
          -
          <lpage>3035</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Duan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Shou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <article-title>Graph-based reasoning over heterogeneous external knowledge for commonsense question answering</article-title>
          ,
          <source>in: Proceedings of the AAAI conference on artificial intelligence</source>
          , volume
          <volume>34</volume>
          ,
          <year>2020</year>
          , pp.
          <fpage>8449</fpage>
          -
          <lpage>8456</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Caverlee</surname>
          </string-name>
          ,
          <article-title>Infusing disease knowledge into bert for health question answering, medical inference and disease name recognition</article-title>
          ,
          <source>arXiv preprint arXiv:2010.03746</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Goodwin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Demner-Fushman</surname>
          </string-name>
          ,
          <article-title>Enhancing question answering by injecting ontological knowledge through regularization</article-title>
          ,
          <source>in: Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing</source>
          , volume
          <year>2020</year>
          , NIH Public Access,
          <year>2020</year>
          , p.
          <fpage>56</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>L.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>KLMo: Knowledge graph enhanced pretrained language model with fine-grained relationships</article-title>
          ,
          <source>in: Findings of the Association for Computational Linguistics: EMNLP 2021</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>4536</fpage>
          -
          <lpage>4542</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>K.</given-names>
            <surname>Faldu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sheth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kikani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Akbari</surname>
          </string-name>
          ,
          <article-title>KI-BERT: Infusing knowledge context for better language and domain understanding</article-title>
          ,
          <source>arXiv preprint arXiv:2104.08145</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Schriml</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Arze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nadendla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-W. W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mazaitis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Felix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. A.</given-names>
            <surname>Kibbe</surname>
          </string-name>
          ,
          <article-title>Disease ontology: a backbone for disease semantic integration</article-title>
          ,
          <source>Nucleic acids research</source>
          <volume>40</volume>
          (
          <year>2012</year>
          )
          <fpage>D940</fpage>
          -
          <lpage>D946</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ashburner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Ball</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Blake</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Botstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Butler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Cherry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Dolinski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Dwight</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Eppig</surname>
          </string-name>
          , et al.,
          <article-title>Gene ontology: tool for the unification of biology</article-title>
          ,
          <source>Nature genetics</source>
          <volume>25</volume>
          (
          <year>2000</year>
          )
          <fpage>25</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>C.</given-names>
            <surname>Rosse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Mejino</surname>
          </string-name>
          <string-name>
            <surname>Jr</surname>
          </string-name>
          ,
          <article-title>The foundational model of anatomy ontology</article-title>
          ,
          <source>in: Anatomy ontologies for bioinformatics: principles and practice</source>
          , Springer,
          <year>2008</year>
          , pp.
          <fpage>59</fpage>
          -
          <lpage>117</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>L.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. Y.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Pmo: A knowledge representation model towards precision medicine</article-title>
          ,
          <source>Math. Biosci. Eng</source>
          <volume>17</volume>
          (
          <year>2020</year>
          )
          <fpage>4098</fpage>
          -
          <lpage>4114</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>U.</given-names>
            <surname>Visser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Abeyruwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Vempati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. P.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Lemmon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Schürer</surname>
          </string-name>
          ,
          <article-title>Bioassay ontology (bao): a semantic description of bioassays and high-throughput screening results</article-title>
          ,
          <source>BMC bioinformatics</source>
          <volume>12</volume>
          (
          <year>2011</year>
          )
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>W. D.</given-names>
            <surname>Duncan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Thyvalikakath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Haendel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Torniai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Acharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Caplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schleyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ruttenberg</surname>
          </string-name>
          ,
          <article-title>Structuring, reuse and analysis of electronic dental data using the oral health and disease ontology</article-title>
          ,
          <source>Journal of Biomedical Semantics</source>
          <volume>11</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gündel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Younesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Malhotra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>de Bono</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-T.</given-names>
            <surname>Mevissen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hofmann-Apitius</surname>
          </string-name>
          ,
          <article-title>Hupson: the human physiology simulation ontology</article-title>
          ,
          <source>Journal of biomedical semantics</source>
          <volume>4</volume>
          (
          <year>2013</year>
          )
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hastings</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ceusters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jensen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mulligan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <article-title>Representing mental functioning: Ontologies for mental health and disease</article-title>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>A.</given-names>
            <surname>Pal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. K.</given-names>
            <surname>Umapathi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sankarasubbu</surname>
          </string-name>
          ,
          <article-title>Medmcqa: A large-scale multi-subject multichoice dataset for medical domain question answering</article-title>
          ,
          <source>in: Conference on Health, Inference, and Learning</source>
          , PMLR,
          <year>2022</year>
          , pp.
          <fpage>248</fpage>
          -
          <lpage>260</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>Transfer learning in biomedical natural language processing: an evaluation of bert and elmo on ten benchmarking datasets</article-title>
          ,
          <source>arXiv preprint arXiv:1906.05474</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>