<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Graphs for Drug Indications with Medical Context</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Reham Alharbi</string-name>
          <email>r.alharbi@liverpool.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Umair Ahmed</string-name>
          <email>umair.ahmed@unicam.it</email>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniil Dobriy</string-name>
          <email>daniil.dobriy@wu.ac.at</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Weronika Łajewska</string-name>
          <email>weronika.lajewska@uis.no</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura Menotti</string-name>
          <email>laura.menotti@unipd.it</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohammad Javad Saeedizade</string-name>
          <email>javad.saeedizade@liu.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michel Dumontier</string-name>
          <email>michel.dumontier@maastrichtuniversity.nl</email>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Knowledge Graph Construction, LLMs in KGC, Medical Knowledge Graph</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Linköping University</institution>
          ,
          <addr-line>Linköping</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, University of Liverpool</institution>
          ,
          <addr-line>Liverpool</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Electrical Engineering and Computer Science, University of Stavanger</institution>
          ,
          <addr-line>Stavanger</addr-line>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Department of Information Engineering, University of Padua</institution>
          ,
          <addr-line>Padova</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Institute for Data, Process and Knowledge Management, Vienna University of Economics and Business</institution>
          ,
          <addr-line>Vienna</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>Institute of Data Science, Maastricht University</institution>
          ,
          <addr-line>Maastricht</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff6">
          <label>6</label>
          <institution>School of Science and Technology, University of Camerino</institution>
          ,
          <addr-line>Camerino</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <fpage>26</fpage>
      <lpage>29</lpage>
      <abstract>
        <p>The medical context for a drug indication provides crucial information on how the drug can be used in practice. However, the extraction of medical context from drug indications remains poorly explored, as most research concentrates on the recognition of medications and associated diseases. Indeed, most databases cataloging drug indications do not contain their medical context in a machine-readable format. This paper proposes the use of a large language model for constructing DIAMOND-KG, a knowledge graph of drug indications and their medical context. The study 1) examines the change in accuracy and precision in providing additional instruction to the language model, 2) estimates the prevalence of medical context in drug indications, and 3) assesses the quality of DIAMOND-KG against NeuroDKG, a small manually curated knowledge graph. The results reveal that more elaborated prompts improve the quality of extraction of medical context; 71% of indications had at least one medical context; 63.52% of extracted medical contexts correspond to those identified in NeuroDKG. This paper demonstrates the utility of using large language models for specialized knowledge extraction, with a particular focus on extracting drug indications and their medical context. We provide DIAMOND-KG as a FAIR RDF graph supported with an ontology. Openly accessible, DIAMOND-KG may be useful for downstream tasks such as semantic query answering, recommendation engines, and drug repositioning research.</p>
      </abstract>
      <kwd-group>
        <kwd>Drug Indications</kwd>
        <kwd>Knowledge Graph Construction</kwd>
        <kwd>LLMs in KGC</kwd>
        <kwd>Medical Knowledge Graph</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Drug indications are regulatory-approved uses for a medicine. The indication of a drug, along
with its medical context, better defines its therapeutic intent and provides additional prescribing
guidance for physicians. Valid medical contexts include the underlying medical illness for
the target condition (“Quetiapine tablets are indicated for the acute treatment of manic
episodes associated with bipolar I disorder”), the age group (“EMGALITY is indicated
for the treatment of episodic cluster headache in adults”), and co-therapies, i.e. drugs that
should be administered in combination (“Clonidine hydrochloride injection is indicated in
combination with opiates for the treatment of severe pain in cancer patients”). The drug
indication and its medical context are contained within drug product labels whose contents are
subject to approval by regulatory agencies such as the U.S. Food and Drug Administration (FDA)
or the European Medicines Agency (EMA).</p>
      <p>
        Machine-readable representations of drug indications are key in computational drug
discovery [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and clinical decision-making [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. However, while databases have been created to
store drug indications in a machine-readable manner (e.g. DrugCentral [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]), these either do not
contain their medical context and as such do not correctly cover the therapeutic intent, or the
medical context is available in some form of natural language and is not available for
computation. The lack of a computable medical context will necessarily limit the development of accurate
methods to predict new drug uses [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] or make treatment recommendations. As a case example,
the Oncology Expert Advisor system derived from IBM Watson was pulled from the market
after making unsafe suggestions relating to cancer treatment.<sup>1</sup> Thus, medical context should be
taken into account when creating accurate, precise, and compute-accessible representations
of drug indications. Knowledge Graphs (KGs) serve as a natural and intuitive way to store,
query, and explore structured knowledge. However, building high-quality knowledge graphs
requires substantial manual effort. Thus, automated Knowledge Graph Construction (KGC)
methods have emerged to alleviate the burden of manual data curation [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. To this end, Large
Language Models (LLMs) are promising technologies for natural language understanding, and in
particular, for constructing knowledge graphs. Recent unpublished work suggests that carefully
engineered prompts have the potential to extract entities and their relations in a structured
manner [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. To the best of our knowledge, there is no method that correctly extracts the
medical context for drug indications.
      </p>
      <p>Contributions. In this work, we propose a novel approach that uses an LLM to extract drugs,
their indications, and the associated medical context into a target Knowledge Graph. We aim
to answer the following research questions: “RQ1: To what extent does the addition of more
instruction to the LLM yield a more accurate/complete extraction of the context?”, “RQ2: How
many drug indications include a medical context?” and “RQ3: What is the quality of the generated
knowledge graph?”. The main contributions of the research are as follows: (i) the development
of an LLM-based framework to extract triples and their context, perform entity recognition, and
produce a valid RDF graph; (ii) the use of the framework to extract drug indications and their
medical context from sentences in natural language; (iii) an evaluation of prompts that vary in
their specificity; and (iv) an estimate of the scale of the problem of recognising and representing
medical context.</p>
      <p><sup>1</sup> https://www.statnews.com/2018/07/25/ibm-watson-recommended-unsafe-incorrect-treatments/</p>
      <p>The rest of the paper is structured as follows: Section 2 reports previous efforts in this task and
useful resources. Section 3 defines the proposed framework and all its components. Section 4
analyzes the results and provides an empirical evaluation of our method. Finally, Section 5
concludes the paper with some final remarks.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>
        DailyMed<sup>2</sup> is a repository of drug labels approved by the FDA and hosted by the National
Institutes of Health (NIH). It contains a wide range of information about drug indications and
contraindications, and is provided as XML files through an API. The task of synthesizing
information about the therapeutic intent has been addressed in several works. DrugBank [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
is a curated resource for detailed drug (i.e. chemical) data along with their drug targets (i.e.
protein), but the indication, if present, is solely expressed in natural language. DrugCentral [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
provides drug information along with indications extracted from product labels, however, not
all indications are coded and there is no additional medical context available. Névéol and Lu [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
focus on automatically extracting and integrating drug indication information from multiple
health resources such as DailyMed and MeSH Scope notes. MEDI [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] applies NLP and ontology
relationships to extract indications for single-ingredient medications. Prompt-based methods
using LLMs are being explored for clinical concept extraction [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and Drug-Drug interaction
triplets extraction [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. To the best of our knowledge, none of the automated methods extracting
drug indications take into account the medical context.
      </p>
      <p>
        Several efforts have been directed towards semi-automated curation of drug indications.
LabeledIn [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] is a human-reviewed, machine-readable, and source-linked catalogue of 7805
indications for 250 human drugs. However, it does not curate medical context. InContext [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]
is a curated set of indications and their medical context for 150 drugs. The dataset was curated
by annotating DailyMed HTML pages using Hypothes.is [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and the National Center for
Biomedical Ontologies (NCBO) BioPortal annotator [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] for concept recognition. InContext
defines five medical contexts: (i) Co-prescribed medication - drugs commonly prescribed together;
(ii) Co-therapies - procedures or therapies (not drug-related, e.g. radiotherapy) that should be
applied in combination with the drug; (iii) Co-morbidities - diseases or conditions that commonly
occur together (with a target condition) in the same patients; (iv) Genetics - genetic variants
for a given disease; (v) Temporal aspects - information that explains at what life stage, disease
stage, or treatment phase a drug should be administered. NeuroDKG [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] is a Knowledge
Graph containing drug indications and their medical context for 101 drugs (from a total of 174
sentences) that target neurological disorders.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed approach</title>
      <p>We propose a framework to transform a textual description of drug indications with medical
context into a knowledge graph using LLM-powered entity recognition along with identifier</p>
      <sec id="sec-3-1">
        <p><sup>2</sup> https://dailymed.nlm.nih.gov/dailymed/</p>
        <sec id="sec-3-1-1">
          <title>3.1. LLM-based entity recognition</title>
          <p>We designed a set of prompts to guide an LLM to identify and extract entities of interest. In
particular, we focus on extracting contextual information from sentences to enrich drug-disease
interaction triples. The prompts contain a Paragraphs section and an Instructions section. Each
prompt is extended with an additional specification to return a defined JSON-formatted object.
The prompts’ instructions range from general (prompt 1) to specific (prompt 3).
The first prompt, called the “Triple Prompt”, extracts subject-verb-object triples from a given text.
Subsequently, we developed the “Context Prompt”, which identifies the context from a given
text and enriches the original triples with such information. Finally, the “Medical Context
Prompt” identifies specific medical context types. This prompt contains a Definition of context
types section within the Instructions section. All prompts and the complete list of predicates
describing the medical context for drug indications are available in the DIAMOND-KG GitHub
repository in the “Supplemental Material” directory<sup>3</sup>.</p>
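          <p>The prompt layout described above can be sketched as follows. This is only an illustrative sketch: the instruction wording, the function names build_prompt and parse_response, and the shape of the JSON reply are assumptions made for the example, not the prompts published in the DIAMOND-KG repository.</p>
          <preformat><![CDATA[
```python
import json

# Hypothetical instruction texts; the real prompts live in the project repository.
INSTRUCTIONS = {
    "triple": "Extract subject-verb-object triples from the paragraphs.",
    "context": "Extract triples and describe any context that qualifies them.",
    "medical_context": (
        "Extract the drug indication and classify its medical context "
        "using only the context types defined below."
    ),
}


def build_prompt(kind, paragraph, definitions=None):
    """Assemble a prompt with a Paragraphs section and an Instructions section."""
    instruction = INSTRUCTIONS[kind]
    parts = ["Paragraphs:", paragraph.strip(), "", "Instructions:", instruction]
    if definitions:
        # Only the most specific prompt carries explicit context type definitions.
        parts += ["", "Definition of context types:"]
        parts += [f"- {name}: {text}" for name, text in definitions.items()]
    parts += ["", "Return the result as a JSON object."]
    return "\n".join(parts)


def parse_response(raw):
    """Parse the model reply; fall back to an empty dict on malformed output."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    return data if isinstance(data, dict) else {}
```
]]></preformat>
          <p>Keeping the three prompts in one template makes it easy to compare them on the same paragraph, since only the instruction text and the optional definitions change.</p>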
        </sec>
        <sec id="sec-3-1-2">
          <title>3.2. Identifier Resolver</title>
          <p>The values for each context type are then grounded to database/ontology identifiers using the
NIH NCATS Translator SRI Name Resolution API<sup>4</sup>. The name resolution service takes lexical
strings and attempts to map them to identifiers (CURIEs, composed of a prefix for the source
followed by a delimiter followed by the resource identifier) from a vocabulary or ontology. The
lookup is not exact but includes partial matches. For each entity mention, we obtain a list of 5
results representing possible conceptual matches, of which the first is the preferred choice, and
the remainder are ranked in decreasing order of preference.</p>
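          <p>As a minimal sketch of how the ranked results might be handled once they have been retrieved, the snippet below splits a CURIE into its prefix and resource identifier and keeps the first of at most five candidates. It deliberately does not call the name resolution service; the helper names and the placeholder identifiers in the usage note are hypothetical.</p>
          <preformat><![CDATA[
```python
from typing import NamedTuple


class Curie(NamedTuple):
    prefix: str    # source vocabulary or ontology
    local_id: str  # resource identifier within that source


def parse_curie(curie, delimiter=":"):
    """Split a CURIE of the form prefix + delimiter + identifier."""
    prefix, sep, local_id = curie.partition(delimiter)
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    return Curie(prefix, local_id)


def preferred_match(candidates, limit=5):
    """Return the top-ranked candidate from a best-first list, or None."""
    top = candidates[:limit]
    return parse_curie(top[0]) if top else None
```
]]></preformat>
          <p>For example, preferred_match(["EX:0001", "EX:0002"]) yields a Curie with prefix "EX" and identifier "0001", where "EX" is a placeholder prefix rather than a real vocabulary.</p>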
        </sec>
      </sec>
      <sec id="sec-3-2">
        <p><sup>3</sup> https://github.com/semantisch/diamond-kg/tree/main/Supplementary%20Material</p>
        <p><sup>4</sup> https://name-resolution-sri.renci.org/docs</p>
        <sec id="sec-3-2-1">
          <title>3.3. RDF Graph Generation</title>
          <p>
            The last component of the framework is the generation of an RDF graph from the JSON result
containing the extracted and named entities and relations. We define the DKG namespace<sup>5</sup>
and iteratively process the sentences from the JSON results, constructing a graph that follows
a lightweight DIAMOND-KG ontology, described below, to represent the sentences and their
semantic components. For each sentence (a dkg:sentence) and each identified component of a
sentence (a dkg:part), we mint a unique IRI in the DKG namespace, using a hashing algorithm
for collision resistance, resulting in dkg:[HASH] and dkg:part/[HASH] minted IRIs respectively,
and assign them corresponding label values via rdfs:label. Next, we relate the components to
their sentence of origin via a dkg:hasPart relation. In the case of prompt 1, the components
form asserted triples. For prompt 2 and prompt 3, we classify the components according to their
context type, using the IRIs dkg:freecontext/[LABEL] for the prompt 2 contexts and
dkg:definedcontext/[LABEL] for the prompt 3 contexts respectively. Finally, the components are
grounded to the database/ontology identifiers (from the NCATS Translator SRI Name Resolution
service) via a skos:closeMatch relation. The resulting graph is made publicly available in Turtle
format<sup>6</sup> and documented following the FAIR (Findable, Accessible, Interoperable, Reusable) Data
Guidelines [
            <xref ref-type="bibr" rid="ref16">16</xref>
            ]. Specifically, each semantic component within the graph is identified by a unique Uniform
Resource Identifier (URI). To ensure data is both interoperable
and reusable, the graph is enriched with comprehensive metadata that adheres to a set of
standard vocabularies. The resulting DIAMOND-KG comprises 12,363 triples, 1,186 entities,
435 predicates, and 148 classes. While the underlying DIAMOND-KG ontology totals 15 classes
(namely, dkg:Sentence, dkg:Context, dkg:Free_context, dkg:Defined_context as well as the 11
defined contexts) and 1 predicate (dkg:hasPart), the majority of DIAMOND-KG classes (133)
stem from the free contexts generated in prompt 2 and the majority of predicates (430) from the
asserted triples generated in prompt 1.
          </p>
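          <p>The IRI minting and linking steps can be illustrated with a small standard-library sketch. Several details here are assumptions: the text does not name the hashing algorithm (SHA-256 is used below), the direction of dkg:hasPart is taken to run from sentence to component, and triples are represented as plain tuples rather than as an RDFLib graph.</p>
          <preformat><![CDATA[
```python
import hashlib

DKG = "http://purl.org/dkg/v1/"  # namespace IRI given in the text
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"


def mint(text, kind=""):
    """Mint a deterministic IRI from a label (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"{DKG}{kind}{digest}"


def sentence_triples(sentence, parts):
    """Build triples for one sentence.

    parts is a list of (label, matched_identifier_or_None) tuples.
    """
    sentence_iri = mint(sentence)
    triples = [(sentence_iri, RDFS_LABEL, sentence)]
    for label, match in parts:
        part_iri = mint(label, "part/")
        triples.append((part_iri, RDFS_LABEL, label))
        # Assumed direction: the sentence has the component as a part.
        triples.append((sentence_iri, DKG + "hasPart", part_iri))
        if match:
            triples.append((part_iri, SKOS_CLOSE_MATCH, match))
    return triples
```
]]></preformat>
          <p>Because the IRI is derived from the label, the same component text always maps to the same node, so repeated mentions across sentences merge without a separate lookup table.</p>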
          <p>The implementation is in Python 3.6 and published as a GitHub repository<sup>7</sup> under the MIT
license<sup>8</sup>. We use the OpenAI gpt-4 model API as the LLM. The maximum number of requested
tokens (with 1 token approximately corresponding to 4 characters of English text<sup>9</sup>) is set to 6,144
(current maximum value for the gpt-4 model) to allow batch prompts containing many
paragraphs as input. The paragraphs are processed individually (1 paragraph per prompt), as batch
processing for this prompt has been found to lead to complex interactions in the prompt results
due to context categories being conceptualised for the batch of sentences together. We use
RDFLib 6.3.2 to generate the RDF graph.</p>
          <p><sup>5</sup> Full IRI: http://purl.org/dkg/v1/</p>
          <p><sup>6</sup> https://huggingface.co/datasets/um-ids/diamond-kg</p>
          <p><sup>7</sup> Available at https://github.com/semantisch/diamond-kg</p>
          <p><sup>8</sup> Refer to https://opensource.org/licenses/MIT</p>
          <p><sup>9</sup> For detailed token calculation refer to: https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Discussion</title>
      <sec id="sec-4-1">
        <title>4.1. Experimental Setup</title>
        <p>
          We compare our results to the NeuroDKG dataset [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], which contains 174 sentences concerning
indications of neurological drugs. The manually annotated triples are subsequently used to
build a KG about drug indications with medical context, which comprises 2,397 triples, 460
entities, 13 properties, and 10 classes<sup>10</sup>. To provide a comparable ground truth, we restricted
NeuroDKG to only the triples of interest, i.e. those in common with DIAMOND-KG. This
step resulted in a dataset of 510 triples, with an average of 3.0 triples for each sentence. The
complete list of selected context-related predicates from NeuroDKG that can be mapped to the
medical context extracted by the DIAMOND-KG prototype is available in our GitHub repository
in the “Supplemental Material” directory<sup>11</sup>. In addition, we map NeuroDKG’s “disease” predicate
to “target” in DIAMOND-KG, since they both represent the medical condition that is targeted
by the considered drug. As far as context is concerned, 64.11% of sentences from NeuroDKG
contain at least one triple representing medical context.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Prompts Analysis</title>
        <p>For each prompt, Table 1 shows the number of triples relating to context that can be successfully
extracted from sentences, the average number of triples per sentence, and the percentage of
sentences that contain medical context information. The latter column is applicable only to the
third prompt, as we provide a predefined set of predicates to extract. We also report the same
information for the NeuroDKG dataset as a comparison. For comparability, we only include
triples directly relating to context and do not include those triples that arise from our modelling
choices in the construction of DIAMOND-KG (i.e., triples relating semantic components to
sentences, labelling, entity typing and rdfs:subClassOf assertions).</p>
        <p>The first prompt generates triples from a given text in a JSON format, without providing
any domain-specific information. With this approach, we are able to extract 442 triples, with
an average of 2.11 triples per sentence, which is the lowest among the different prompts. This
result can be attributed to the general nature of the instruction, which may affect the system’s
performance.</p>
        <p><sup>10</sup> The NeuroDKG Knowledge Base is available on Zenodo at https://doi.org/10.5281/zenodo.5541440</p>
        <p><sup>11</sup> https://github.com/semantisch/diamond-kg/tree/main/Supplementary%20Material</p>
        <p>Figure 2: Context information distribution from prompt 3 results. For each context entity, we report
the number of sentences for which such information has been extracted by the prompt (series: NeuroDKG and DIAMOND-KG).</p>
        <p>The second prompt tests the power of generative AI to extract context information
from text, without providing any specific types of context. As reported in Table 1, prompt 2 is
able to extract 911 triples, with an average of 4.82 triples per sentence. The LLM-based Entity
Recognition module identifies 140 different context types ranging from domain-specific entities
such as “medication”, “treatment”, and “symptom” to more general entities like “demographics”
or “Publication”. We find that most predicates are synonyms, e.g. “treatment” and “Medical
treatment”, or differ only in how they are written: “Medical Treatment” and
“Medical_treatment”, for example, are considered two distinct contexts. This situation can lead to inconsistency among the
KG’s predicates and increases the possibility of duplicate information. To mitigate this issue,
additional post-processing steps will be needed between the output of the LLM-based Entity
Recognition module and the construction of the KG. The third prompt includes domain-specific
information and the set of context entities we plan to extract. With this approach, we are able
to extract 621 triples, with an average of 3.32 extracted triples for each sentence. Limiting the
list of possible predicates results in a decrease in the number of extracted triples, which may
also lead to lost information. Section 4.3 shows that we lose little information with this
approach. The third prompt identifies medical context in 71.12% of sentences, which
means that we are able to extract context information for more sentences than the NeuroDKG
dataset.</p>
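        <p>One possible post-processing step for the spelling variants noted above is a simple label normalisation, sketched below. The function names are hypothetical, and the sketch only merges variants that differ in case, whitespace, or underscores; true synonyms such as “treatment” and “Medical treatment” would still need a curated mapping.</p>
        <preformat><![CDATA[
```python
import re
from collections import defaultdict


def normalize_label(label):
    """Lower-case a label and collapse underscores and whitespace runs."""
    collapsed = re.sub(r"[\s_]+", " ", label.strip())
    return collapsed.lower()


def group_variants(labels):
    """Group raw context labels by their normalised form."""
    groups = defaultdict(set)
    for label in labels:
        groups[normalize_label(label)].add(label)
    return dict(groups)
```
]]></preformat>
        <p>Applied to the labels “Medical Treatment”, “Medical_treatment”, and “treatment”, this yields two groups: one holding both spellings of “medical treatment” and one holding “treatment”.</p>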
        <p>Figure 2 reports the distribution of the different predicates across the dataset, compared to the
NeuroDKG output. Due to its nature, the NeuroDKG dataset extracts context information only for
the first seven context types. Overall, DIAMOND-KG extracts more context information than
NeuroDKG in most cases. The amount of extracted context information is similar for the types
“co-morbidity” and “co-therapy”. The main difference between the two concerns the “target”
and “symptom” context types. In the former case, NeuroDKG extracts such information for 176
sentences, compared to the 126 identified by DIAMOND-KG. This behaviour can be attributed
to the variety of context types available to the DIAMOND-KG system. To this end, we analyzed
the difference in values for the “target” context type and discovered that some triples labelled
as “target” in NeuroDKG are identified by DIAMOND-KG with some other context types, such
as “symptom” or “co-morbidity”. Take the sentence “Xyrem is indicated for the treatment of
cataplexy or excessive daytime sleepiness (EDS) in patients 7 years of age and older with narcolepsy”
as an example. NeuroDKG identifies “narcolepsy” as the target and “cataplexy” as a symptom. On
the other hand, DIAMOND-KG correctly classifies “cataplexy” as a symptom but recognizes
“narcolepsy” as a co-morbidity. This classification is not entirely incorrect: “narcolepsy”
is not the disease targeted by the drug, which treats “cataplexy”. In general, DIAMOND-KG
assigns different context types for 120 medical contexts compared to NeuroDKG. Additionally,
given the same type assigned, in 82 cases the medical context identified by DIAMOND-KG is
broader and more informative than the one provided in NeuroDKG.</p>
        <p>Most sentences were associated with the following context entities: “co-prescribed
medication” (144 sentences), “target” (126 sentences), “conditional” (69 sentences) and “age group” (69
sentences). The fewest sentences were associated with “past therapies”, “temporal
aspects”, and “genetics”, which are present in 6, 4, and 0 sentences respectively. We analyzed
the sentences and confirmed that this trend represents the real distribution of the dataset.
Indeed, fewer indications present information related to past therapies or to the life stage or
disease stage at which a drug should be administered (i.e. temporal aspects). As for genetics, we found
no information related to this context type in any sentence. This could be due to the nature of
the dataset, i.e. neurological drugs.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Empirical Evaluation of the Third Prompt</title>
        <p>As discussed above, NeuroDKG is not sufficient to evaluate the results of DIAMOND-KG as it
contains fewer context types, and in most cases, our system seems to provide more informative
triples than NeuroDKG. To evaluate the quality of the information extracted by prompt 3, we
manually annotated all sentences in NeuroDKG considering all context types in DIAMOND-KG,
and compared our ground truth with the output of our system. For each (context value, context
type) pair, we are interested in whether our system is able to extract meaningful information
and classify it with the correct context type. Overall, DIAMOND-KG achieved an accuracy
of 63.52% across all context types, with a Hamming loss of 4.20%. We identified three
common errors: “wrong pairs” (18.24%) are those that are not present in the ground truth,
“misclassified pairs” (9.77%) are those present in the ground truth but with a different context
type, and “missing pairs” (8.47%) are pairs that are present in the ground truth but not in
DIAMOND-KG’s output. Table 2 reports the performance metrics for each context type, which
vary in terms of their value as well as their occurrence. Context types with scores about
70% also exhibit reasonable support, e.g. “target”, “age group”, and “symptom”. The method
performs well on identifying the “age group” suited for a given drug, with precision, recall,
and F1-score above 90%. The lowest results are registered on the context types with the lowest
support, where a few wrong pairs have a higher impact on performance. These findings may
indicate that some context types are underrepresented in the dataset. Recall is above 70% in
most cases, except for “temporal aspect” (67%), “co-morbidity” (28%), and “past therapies”
(17%), which all exhibit low support. A high recall indicates that the method is able to return
most of the relevant pairs, meaning that providing a predefined set of context types does not
hinder the system or cause loss of information.</p>
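        <p>The error categories can be made concrete with a short sketch. It rests on an assumed reading of the definitions: a predicted pair is “misclassified” when its value appears in the ground truth under another type, “wrong” when its value does not appear at all, and a ground-truth pair is “missing” when its value was never predicted. Under this reading the four categories partition the pairs, which is consistent with the reported percentages summing to 100%, but the exact procedure used in the study may differ.</p>
        <preformat><![CDATA[
```python
def categorize(predicted, gold):
    """Count correct, misclassified, wrong, and missing (value, type) pairs."""
    predicted, gold = set(predicted), set(gold)
    gold_values = {value for value, _ in gold}
    predicted_values = {value for value, _ in predicted}
    extra = predicted - gold   # predicted pairs absent from the ground truth
    unmatched = gold - predicted  # ground-truth pairs absent from the output
    return {
        "correct": len(predicted & gold),
        "misclassified": sum(1 for value, _ in extra if value in gold_values),
        "wrong": sum(1 for value, _ in extra if value not in gold_values),
        "missing": sum(1 for value, _ in unmatched if value not in predicted_values),
    }


def shares(counts):
    """Express each category as a percentage of all categorised pairs."""
    total = sum(counts.values())
    return {name: (100.0 * n / total if total else 0.0) for name, n in counts.items()}
```
]]></preformat>
        <p>With this breakdown, the share of “correct” pairs plays the role of the overall accuracy, and the remaining three shares correspond to the error types discussed above.</p>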
        <p>Table 2 (a): Precision, Recall, and F1-score for each context type (Target, Symptom, Age Group, Adj. Therapy, Co-morb., Treat. Duration, Co-therapy, Co-presc. Med., Conditional, Past Therapies, Temp. Aspects).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>Table 2 (b): Precision, Recall, and F1-score for the average results (Micro Avg., Macro Avg., Weight. Avg., Samples Avg.).</p>
      <p>We explore a novel approach that leverages LLMs to extract relevant information and the
associated medical context from drug indications. To the best of our knowledge, this is the first effort
to extract such information by means of LLMs. The prototype system, called DIAMOND-KG, uses
an LLM to recognize entities, which are subsequently passed to a service to perform identifier
mapping, and the final step creates a FAIR RDF knowledge graph. In relation to
RQ1, we find that the refinement of the contexts produces higher-quality outcomes compared to manually
curated datasets. Moreover, it identifies a broader set of contexts and more informative results.
This framework offers a promising approach to automatically extract drug indications
and their medical context, and it also raises the possibility of accurately and
systematically extracting a wide variety of contextual information in other context-dependent
settings. In relation to RQ2, based on the set of sentences annotated in NeuroDKG, we find that
at least 71.12% of sentences have one or more contexts, which is greater than the 64.11% reported
in the manually annotated NeuroDKG. This result indicates that a significant proportion of
drug indications do contain a medical context, and the framework is able to identify these to a
greater extent than manual curation. In relation to RQ3, the quality of the extraction varies
based on the context type, but we mainly attribute this variation to the difference in
support. DIAMOND-KG achieved a high recall, demonstrating that the system extracts a large
portion of the relevant information and experiences little information loss.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This project was initiated through participation in the International Semantic Web Research
Summer School (ISWS 2023). We wish to acknowledge the outstanding support received from
the School’s organizers Valentina Presutti and Harald Sack, and from our assistant tutor Oleksandra
Bruns. MD and UA were supported by the European Union’s Horizon 2020 research and
innovation programme under the Marie Skłodowska-Curie Actions (MD grant agreement No
860801; UA grant agreement No 955569). LM is supported by the HEREDITARY Project, as
part of the European Union’s Horizon Europe research and innovation programme under grant
agreement No GA 101137074. RA is supported by a PhD studentship from Taibah University,
Saudi Arabia, and the Saudi Arabian Cultural Bureau (SACB) in London.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S. I.</given-names>
            <surname>Avram</surname>
          </string-name>
          , et al.,
          <article-title>DrugCentral 2021 supports drug discovery and repositioning</article-title>
          ,
          <source>Nucleic Acids Research</source>
          <volume>49</volume>
          (
          <year>2020</year>
          )
          <fpage>D1160</fpage>
          -
          <lpage>D1169</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Nelson</surname>
          </string-name>
          , et al.,
          <article-title>Formalizing drug indications on the road to therapeutic intent</article-title>
          ,
          <source>JAMIA</source>
          <volume>24</volume>
          (
          <year>2017</year>
          )
          <fpage>1169</fpage>
          -
          <lpage>1172</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Marchesin</surname>
          </string-name>
          , et al.,
          <article-title>Building a large gene expression-cancer knowledge base with limited human annotations</article-title>
          ,
          <source>Database J. Biol. Databases Curation</source>
          <volume>2023</volume>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Trajanoska</surname>
          </string-name>
          , et al.,
          <article-title>Enhancing knowledge graph construction using large language models</article-title>
          ,
          <year>2023</year>
          . arXiv:2305.04676.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Caufield</surname>
          </string-name>
          , et al.,
          <article-title>OntoGPT</article-title>
          ,
          <year>2023</year>
          . URL: https://monarch-initiative.github.io/ontogpt/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Wishart</surname>
          </string-name>
          , et al.,
          <article-title>DrugBank: a comprehensive resource for in silico drug discovery and exploration</article-title>
          ,
          <source>Nucleic Acids Research</source>
          <volume>34</volume>
          (
          <year>2005</year>
          )
          <fpage>D668</fpage>
          -
          <lpage>D672</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Névéol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>Automatic integration of drug indications from multiple health resources</article-title>
          ,
          <source>in: Proc. of the 1st ACM international health informatics symposium</source>
          ,
          <year>2010</year>
          , pp.
          <fpage>666</fpage>
          -
          <lpage>673</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>W. Q.</given-names>
            <surname>Wei</surname>
          </string-name>
          , et al.,
          <article-title>Development and evaluation of an ensemble resource linking medications to their indications</article-title>
          ,
          <source>JAMIA</source>
          <volume>20</volume>
          (
          <year>2013</year>
          )
          <fpage>954</fpage>
          -
          <lpage>961</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Peng</surname>
          </string-name>
          , et al.,
          <article-title>Clinical concept and relation extraction using prompt-based machine reading comprehension</article-title>
          ,
          <source>JAMIA</source>
          <volume>30</volume>
          (
          <year>2023</year>
          )
          <fpage>1486</fpage>
          -
          <lpage>1493</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H.</given-names>
            <surname>Hu</surname>
          </string-name>
          , et al.,
          <article-title>A generative drug-drug interaction triplets extraction framework based on large language models</article-title>
          ,
          <source>Proc. of the Association for Information Science and Technology</source>
          <volume>60</volume>
          (
          <year>2023</year>
          )
          <fpage>980</fpage>
          -
          <lpage>982</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R.</given-names>
            <surname>Khare</surname>
          </string-name>
          , et al.,
          <article-title>LabeledIn: Cataloging labeled indications for human drugs</article-title>
          ,
          <source>Journal of Biomedical Informatics</source>
          <volume>52</volume>
          (
          <year>2014</year>
          )
          <fpage>448</fpage>
          -
          <lpage>456</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>K.</given-names>
            <surname>Moodley</surname>
          </string-name>
          , et al.,
          <article-title>InContext: curation of medical context for drug indications</article-title>
          ,
          <source>Journal of Biomedical Semantics</source>
          <volume>12</volume>
          (
          <year>2021</year>
          )
          <fpage>2</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Hypothesis</surname>
          </string-name>
          , Hypothesis.is - Open Annotation Tool,
          <year>2023</year>
          . URL: https://web.hypothes.is, accessed: 2023-06-13.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>BioPortal</surname>
          </string-name>
          , BioPortal Annotator,
          <year>2023</year>
          . URL: https://bioportal.bioontology.org/annotator, accessed: 2023-06-13.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          , et al.,
          <article-title>Publishing Medical Context of Neurological Drug Indications as a Knowledge Graph</article-title>
          ,
          <source>Technical Report</source>
          , Institute of Data Science, Maastricht University, Maastricht, the Netherlands,
          <year>2021</year>
          . URL: https://github.com/MaastrichtU-IDS/neuro_dkg/blob/master/publication.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Wilkinson</surname>
          </string-name>
          , et al.,
          <article-title>The FAIR guiding principles for scientific data management and stewardship</article-title>
          ,
          <source>Scientific Data</source>
          <volume>3</volume>
          (
          <year>2016</year>
          )
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>