<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
<journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Graph of Causal Relations in Drug Reviews</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vanni Zavarella</string-name>
          <email>vanni.zavarella@unica.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lorenzo Bertolini</string-name>
          <email>lorenzo.bertolini@ec.europa.eu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergio Consoli</string-name>
          <email>sergio.consoli@ec.europa.eu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gianni Fenu</string-name>
          <email>gianni.fenu@unica.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Diego Reforgiato Recupero</string-name>
          <email>diego.reforgiato@unica.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Zani</string-name>
          <email>alessandro.zani@ec.europa.eu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Mathematics and Computer Science, University of Cagliari</institution>
          ,
          <addr-line>Cagliari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>European Commission, Joint Research Centre (JRC)</institution>
          ,
          <addr-line>Ispra</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <volume>1</volume>
      <fpage>885</fpage>
      <lpage>892</lpage>
      <abstract>
<p>This paper presents the use of JSL-MedLlama, a decoder-only Large Language Model (LLM) trained within the medical domain, to create a knowledge graph of causal relationships from drug reviews. We leverage MIMICause, a dataset of causal narratives from clinical notes, to benchmark JSL-MedLlama on classifying causal narratives using instruction fine-tuning. The results show that it obtains satisfactory performance, outperforming encoder-only baselines. Furthermore, we validate our algorithm's robustness and cross-domain generalization by testing it on the Drug Reviews dataset, a collection of patient reviews of specific drugs along with the related conditions. We then deploy the model on a subset of around 19,000 Drug Reviews, generating a knowledge graph of 3,050 unique triples connecting 1,149 Drugs and 322 Conditions through the considered causal relations. The results highlight the role of decoder-only LLMs, fine-tuned within the biomedical domain, in advancing causal reasoning and generating valuable resources for real-world biomedical use cases. We make the drug-condition causal relation knowledge graph publicly available to support future research efforts in the field.</p>
      </abstract>
      <kwd-group>
        <kwd>Causality</kwd>
        <kwd>Large Language Models</kwd>
        <kwd>Knowledge Graphs</kwd>
        <kwd>Clinical NLP</kwd>
        <kwd>Instruction fine-tuning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Causal relation extraction (CRE), the task of identifying causal relationships between events or entities
in text, is critical to advancing knowledge discovery in the biomedical domain [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Causal reasoning
methodologies can be broadly classified into two paradigms: qualitative and quantitative.
Qualitative approaches predominantly conceptualize causal reasoning as a classification task. In contrast,
quantitative methods leverage ad-hoc metrics to quantify causal strength, systematically accounting for
the inherent uncertainties that pervade causal inference [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Diverse sources of unstructured observational data, including
electronic health records (EHRs), clinical notes, and online drug reviews, can serve as valuable inputs
for causal inference experiments, allowing researchers and healthcare professionals to identify potential
risk factors, understand disease progression, and assess treatment effectiveness [
        <xref ref-type="bibr" rid="ref2 ref4 ref5 ref6">2, 4, 5, 6</xref>
        ]. However,
manually analyzing vast amounts of biomedical literature and clinical texts is infeasible, requiring
automated approaches for the extraction of causal relationships [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        In the biomedical domain, several specialized datasets have been introduced. The ADE corpus [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]
consists of MEDLINE case reports annotated with mentions of drugs, adverse effects, dosages, and their
interrelations. Similarly, BioCause [9] annotates 851 causal relations extracted from 19 open-access
biomedical journal articles. More recently, causal knowledge has also been released in the
form of biological knowledge graphs, capturing causal and correlative relationships between entities
using BEL (Biological Expression Language) statements [11].</p>
      <p>Despite its significance, achieving robust and generalizable performance in CRE remains challenging
due to the complexity, variability, and ambiguity of biomedical texts [12, 13, 14]. In recent years, Large
Language Models (LLMs) have emerged as powerful tools for solving various NLP tasks, demonstrating
remarkable capabilities in understanding and generating text across multiple domains [15, 16, 17].
LLMs, including transformer-based architectures such as GPT [18] and BERT [19] derivatives, have
shown promise in improving CRE by leveraging vast biomedical corpora and pre-trained knowledge to
recognize complex causal relationships [20].</p>
      <p>This paper presents a qualitative approach to causal reasoning by adopting JSL-MedLlama and
fine-tuning it using the MIMICause dataset [21], a widely recognized resource for extracting causal
relationships in clinical text. We experiment with instruction fine-tuning techniques and compare
the resulting model against two strong baselines based on BERT and Clinical-BERT encoders. Our
findings show that the tested decoder-only model, fine-tuned on domain-specific biomedical data and
further adapted by us to the target task through instruction tuning, achieves satisfactory performance
and outperforms the considered encoder-only baselines.</p>
      <p>Furthermore, we tested our algorithm’s robustness and cross-domain generalization on the Drug
Reviews dataset, a collection of patient reviews on specific drugs and related conditions. To validate the
extracted causal relationships, we annotated a subset of identified instances, achieving high accuracy
and strong inter-annotator agreement, confirming the reliability of our approach and its adaptability to
real-world biomedical scenarios. We then deployed the model on a subset of the Drug Reviews
dataset, generating a knowledge graph of triples connecting Drug and Condition type entities through
four types of causal relations, and we make the resource publicly available.</p>
      <p>The remainder of this paper is structured as follows. Section 2 introduces the task addressed in this
work and the dataset used to train our model. Section 3 presents the methodology used to classify
causal relations and how it is deployed on the Drug Reviews dataset (Section 3.1). In Section 3.2, we
describe the knowledge graph constructed from this dataset and provide relevant analytics. Finally,
Section 4 concludes the paper with a summary of findings and directions for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Dataset and Task Definition</title>
      <p>We train our models to identify causal narratives within clinical notes using the MIMICause dataset [21].
The MIMICause dataset is derived from a collection of de-identified discharge summaries sourced from
the MIMIC-III (Medical Information Mart for Intensive Care-III) clinical database [22], which were
annotated for nine types of biomedical entities (Drug, ADE, Reason, Dosage, Strength, Form, Frequency,
Route and Duration).</p>
      <p>The MIMICause annotation schema defines that “a causal relationship/association exists when one or
more entities affect another set of entities” [21]. Eight directed relation types between two entities e1 and
e2 are defined, where the order of the entity tags determines the direction of causality: Cause(e1,e2),
Cause(e2,e1), Enable(e1,e2), Enable(e2,e1), Prevent(e1,e2), Prevent(e2,e1), Hinder(e1,e2), Hinder(e2,e1).
Additionally, the Other class encompasses instances where either a non-causal interaction or no relationship
at all exists between a given pair of biomedical entities. For more details on the definitions of the causal
relation schema, refer to the original paper [21].</p>
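<p>The nine-class schema above can be encoded as follows. This is an illustrative sketch: the string labels and the direction-swapping helper are our own choices, not part of the MIMICause release.</p>

```python
# Illustrative encoding of the MIMICause relation schema: four directed
# relation types in both entity orders, plus the symmetric Other class.
RELATION_LABELS = [
    "Cause(e1,e2)", "Cause(e2,e1)",
    "Enable(e1,e2)", "Enable(e2,e1)",
    "Prevent(e1,e2)", "Prevent(e2,e1)",
    "Hinder(e1,e2)", "Hinder(e2,e1)",
    "Other",
]

def swap_direction(label):
    """Return the same relation with the entity order reversed; Other is unchanged."""
    if label == "Other":
        return label
    head, args = label.split("(")
    return head + ("(e2,e1)" if args == "e1,e2)" else "(e1,e2)")
```

Such a helper is convenient, for instance, when scoring predictions while disregarding directionality.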
      <p>Causal relationships can link entity pairs within the same sentence or, in rare cases, span a few
sentences in the input text. These relationships may be explicitly signaled by lexical causal connectives,
such as “due to”, or they may be implicit, requiring inference from the broader context. The MIMICause
dataset comprises 2,714 examples, with a train-dev-test split of 1,953 for training, 493 for development,
and 268 for testing.</p>
      <sec id="sec-2-1">
        <title>Task Formulation</title>
        <p>The MIMICause dataset is available at https://huggingface.co/datasets/pensieves/mimicause; MIMIC-III is distributed via Harvard’s DBMI Data Portal: https://portal.dbmi.hms.harvard.edu/projects/n2c2-nlp/.</p>
        <p>The task of identifying causal relations is formulated as a single-label multi-class relation classification
problem:
f : (T, e1, e2) → r
T = [t1, t2, ..., t(n−1), tn]
e1 = T[i:j]
e2 = T[k:l]
with i ≤ j and i, j ∈ [1..n],
with k ≤ l and k, l ∈ [1..n],
j &lt; k or l &lt; i
where r ∈ {0, ..., 8} is the relation label (the four relation types in both directions plus the Other category), T is an input
text sequence, and e1 and e2 are non-overlapping, contiguous token subsequences of T representing the
entities between which the causal relation is to be identified (either entity can precede the other).</p>
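<p>The span constraints of this formulation (in-bounds, well-formed, non-overlapping entity spans, either order) can be checked with a few lines of code; the function name and the 1-based inclusive convention mirror the formulation above and are otherwise our own.</p>

```python
def valid_entity_spans(i, j, k, l, n):
    """Check the task-formulation constraints for e1 = T[i:j] and e2 = T[k:l],
    with 1-based inclusive indices into a token sequence T of length n:
    spans must be in-bounds, well-formed (j >= i, l >= k), and non-overlapping
    (e1 entirely before e2, or e2 entirely before e1)."""
    in_bounds = all(x >= 1 and n >= x for x in (i, j, k, l))
    well_formed = j >= i and l >= k
    non_overlapping = k > j or i > l
    return in_bounds and well_formed and non_overlapping
```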
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <p>We perform instruction fine-tuning for classifying causal relations using a SOTA open-source LLM
with decoder-only architecture. The reference baselines are the two SOTA encoder-only architectures
described in [21], both leveraging BERT-based text encoders combined with fully connected feedforward
network (FFN) classifier layers. We will refer to them as BERT+Ent and Clinical-BERT+Ent. Among
these baselines, the architecture incorporating the domain-specific Clinical-BERT encoder, denoted as
Clinical-BERT+Ent in Table 1, yields the best performance.</p>
      <p>For our experiments, we use johnsnowlabs/JSL-MedLlama-3-8B-v2.0 (shortened as JSL-MedLlama),
an advanced model developed by John Snow Labs on top of the Llama-3-8B architecture and specifically
tailored for medical and healthcare applications, having undergone fine-tuning on extensive medical
literature and datasets. The model is accessible through Hugging Face via the Transformers library,
thus making our study fully reproducible.</p>
      <p>For our instruction fine-tuning implementation, we first transformed the MIMICause training split
into instruction prompts, which include for each training instance references to the e1 and e2 input
entities; then, we fine-tuned our model on the resulting instruction dataset using the trainer class
from Hugging Face. Given the computational limitations of fully fine-tuning large generative models, we
employed the Low-Rank Adaptation (LoRA) technique for Parameter-Efficient Fine-Tuning [23]. The
resulting model is named CLiMA (Causal Linking for Medical Annotation).</p>
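<p>The transformation of MIMICause instances into instruction prompts can be sketched as below. The template wording is purely illustrative: the actual prompts used for CLiMA are defined in the released training scripts, not here.</p>

```python
# Hypothetical instruction-prompt builder for MIMICause-style instances.
RELATIONS = [
    "Cause(e1,e2)", "Cause(e2,e1)", "Enable(e1,e2)", "Enable(e2,e1)",
    "Prevent(e1,e2)", "Prevent(e2,e1)", "Hinder(e1,e2)", "Hinder(e2,e1)",
    "Other",
]

def build_instruction(text, e1, e2, label=None):
    """Build a prompt (or a prompt/completion pair when a gold label is given)
    for instruction fine-tuning; at inference time, label is None."""
    prompt = (
        "Classify the causal relation between the two marked entities.\n"
        "Possible relations: " + ", ".join(RELATIONS) + ".\n"
        "Text: " + text + "\n"
        "Entity e1: " + e1 + "\n"
        "Entity e2: " + e2 + "\n"
        "Relation:"
    )
    return prompt if label is None else (prompt, " " + label)
```

Each (prompt, completion) pair can then be fed to the Hugging Face trainer class for supervised fine-tuning with LoRA adapters.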
      <sec id="sec-3-6">
        <title>Results</title>
        <p>[Table 1: F1 scores per relation class (Cause, Enable, Prevent, Hinder, Other) and Macro F1 on the MIMICause test set for the BERT+Ent and Clinical-BERT+Ent baselines and the fine-tuned CLiMA model; the individual cell values (0.85, 0.77, 0.845, 0.8, 0.89, among others) are not reliably recoverable from the extraction.]</p>
        <sec id="sec-3-6-1">
          <title>Evaluation</title>
          <p>The model card is at https://huggingface.co/johnsnowlabs/JSL-MedLlama-3-8B-v2.0. We opt for a small-range model in order to operate within the constraints of limited compute resources: we train and
run model inferences on a single A100 GPU with 40GB SDRAM, applying 4-bit quantization, using the Hugging Face trainer class (https://huggingface.co/docs/transformers/en/main_classes/trainer).</p>
          <p>[Table 2: example reviews from the filtered Drug Reviews subset:]
“i had a urinary tract infection so bad that when i pee it smells but
when i started taking ciprofloxacin it worked it’s a good medicine
for a urinary tract infections.”
“i tried the nuvaring. this was my first form of any birth control. this
was very easy to put inside and very easy to take out. i didn’t feel
the ring ever. i thought it was amazing until i started to get huge
deep pimples. they were impossible to get rid of.”
“when i first started using ziana, i only had acne in between my
eyebrows, chin, and the nose area. my acne worsened while using
it and then it got better. but after about 4 months of using it, it
became ineffective. so i now have acne between my eyebrows, chin,
cheeks, forehead, and the nose area. its great at first but after a
while it made my face even worse than before i used the product.”</p>
          <p>Across relation classes, the model exhibits significantly lower performance for Enable and Hinder,
which tend not to be distinguished from Cause and Prevent, respectively.</p>
          <p>We make the model publicly available as LoRA adapters, with the associated training scripts and
hyperparameter settings, in the Hugging Face repository: https://huggingface.co/unica/CLiMA.
3.1. Causal Relations from Drug Reviews
We evaluated the cross-domain generalization capabilities of the fine-tuned model by deploying it
on the open-source Drug Reviews (Druglib.com) dataset, available within the UCI Machine Learning
Repository. The Drug Reviews dataset contains around 215 thousand patient reviews of specific
drugs along with the related conditions, crawled from online pharmaceutical review sites. This dataset is
distributed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which allows
for the use, sharing and adaptation of the data for research purposes.</p>
          <p>While similar in topic, the reviews in Drug Reviews differ in language style from MIMICause,
as they contain slang and are not curated. This allows us to test the robustness of our model on the
causal relation extraction task. In the Drug Reviews dataset, the target Drug and Condition metadata
entities are not always explicitly mentioned in the review text. In order to remain compliant with the
instruction prompt settings of our fine-tuned JSL-MedLlama model, we first filtered a subset of around
19,200 items from Drug Reviews where both the Drug and Condition entities are matched within the text.
Table 2 lists a few examples of reviews from this subset. Subsequently, for our evaluation we deployed
the model on a randomly selected sample of 40 reviews for each possible relation: “Cause”, “Prevent”,
“Hinder”, “Enable” and “Other”, yielding an overall set of 200 relations to be validated.</p>
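<p>The stratified evaluation sample (40 reviews per predicted relation class) can be drawn as follows; `predictions` is a hypothetical list of (review, predicted_relation) pairs, and the function name is our own.</p>

```python
import random
from collections import defaultdict

def sample_per_relation(predictions, per_class=40, seed=0):
    """Randomly select up to `per_class` reviews for each predicted relation
    class, e.g. 40 each for Cause, Prevent, Hinder, Enable and Other."""
    by_relation = defaultdict(list)
    for review, relation in predictions:
        by_relation[relation].append(review)
    rng = random.Random(seed)  # fixed seed for a reproducible sample
    return {rel: rng.sample(items, min(per_class, len(items)))
            for rel, items in by_relation.items()}
```

With five relation classes, this yields the 200 relation instances submitted to the annotators.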
          <p>We evaluated the correctness and directionality of the extracted causal relations, involving three
annotators per relation class. The annotators assessed whether each relation was correct, choosing True
if the relation (E1 causal_rel E2) was supported by the text, False if not, or Swapped Entities
if the relation was correct but with the opposite direction (E2 causal_rel E1).</p>
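<p>Consolidating the three annotators' judgments amounts to a majority vote plus pairwise agreement scores; a plain-Python sketch (our own minimal implementation of Cohen's κ, not the library code used for the reported figures) is:</p>

```python
from collections import Counter
from itertools import combinations

def cohen_kappa(a, b):
    """Cohen's kappa for two raters labeling the same items."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    # Chance agreement from the raters' marginal label distributions.
    expected = sum(ca[lab] * cb[lab] for lab in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

def mean_pairwise_kappa(raters):
    """Average Cohen's kappa over all rater pairs."""
    pairs = list(combinations(raters, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)

def majority_vote(labels):
    """Most frequent label among the raters for one item."""
    return Counter(labels).most_common(1)[0][0]
```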
          <p>We calculated the average pair-wise Cohen’s κ inter-rater agreement [24] over all three raters, resulting
in a value of 0.739, as well as the Fleiss’ κ agreement [25], resulting in a value of 0.728. These values,
ranging in [−1, +1], both indicate a substantial level of agreement among the annotators. (In MIMICause,
Enable(e1,e2) means that the emergence, application or increase of e1 leads to the emergence or increase
of e2 “jointly to a set of other contributing factors”.) The Drug Reviews dataset is available at
https://archive.ics.uci.edu/dataset/461/drug+review+dataset+druglib+com within the UCI Machine
Learning Repository (https://archive.ics.uci.edu/). We then applied a majority vote among the three
annotators for the 200 samples of causal relations from our model, thus forming a small gold standard.
Table 3 summarizes the results categorized by type of relation, presenting also the average pair-wise
Cohen’s κ inter-rater agreement, Fleiss’ κ agreement, and the precision score achieved for each of the
relations within the gold standard.</p>
        </sec>
      </sec>
      <sec id="sec-3-8">
        <title>Results on Drug Reviews</title>
        <p>[Table 3: Cohen’s κ, Fleiss’ κ, and precision per relation class on the gold standard; the individual cell values (e.g. 0.706, 0.707 for Cause) are not reliably recoverable from the extraction.]</p>
        <p>The achieved overall precision is 0.73. If we disregard the directionality of the extracted relations, the
precision slightly increases to 0.76. In both cases, the level of precision is quite satisfactory, as it closely
aligns with the algorithm’s overall performance on the original MIMICause test dataset, for which it
was specifically trained, proving the robustness and generalization capabilities of our model.</p>
        <p>The raters found annotating the Enable and Hinder relations more challenging, resulting in slightly
lower agreement and precision scores (both 0.60). This observation aligns with the performance analysis
on MIMICause in Section 3, where these two classes achieved slightly lower F1 scores compared to the
others.
3.2. Knowledge Graph
We deploy the fine-tuned JSL-MedLlama-3-8B-v2.0 on the 19,200-instance subset of Drug Reviews
and generate a causal drugs knowledge graph (referred to as CausalDrugsKG), comprising 19,200
triples. Of these, 3,050 are distinct (non-reified) triples, connecting 1,149 unique Drug
entities and 322 unique Condition entities via the five considered causal relation categories, i.e. Cause,
Enable, Prevent, Hinder and Other. In the corresponding ontology, designed to describe CausalDrugsKG
(causaldrugskg-ont namespace prefix), each extracted claim is subsequently reified into an instance of
the causaldrugskg-ont:Statement class, representing a specific assertion derived from a collection
of drug review items. A sample of generated (un-reified) statements is illustrated in Table 4, together
with their support, where the support is the number of reviews in which the full triple was matched.</p>
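<p>Before reification, the distinct triples and their support can be computed as below; `extractions`, the per-review model outputs, and the function name are hypothetical placeholders for illustration.</p>

```python
from collections import Counter

def aggregate_triples(extractions):
    """Collapse per-review (drug, relation, condition) extractions into
    distinct triples, each with its support, i.e. the number of reviews
    in which the full triple was matched."""
    support = Counter(extractions)
    # Sort by descending support so the best-attested statements come first.
    return sorted(support.items(), key=lambda kv: -kv[1])
```

Each resulting (triple, support) pair then becomes one reified causaldrugskg-ont:Statement instance in the graph.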
        <p>We have made the automatically generated CausalDrugsKG graph publicly available, under a Creative
Commons Attribution 4.0 International (CC BY 4.0) license, in the Turtle RDF serialization format in the
European Data portal (https://data.jrc.ec.europa.eu/dataset/acebeb4e-9789-4b5c-97ec-292ce14e75d0).
The direct link is: https://jeodpp.jrc.ec.europa.eu/
ftp/jrc-opendata/ETOHA/ETOHA-OPEN/CausalDrugsKG.ttl.</p>
        <p>As an illustration of how CausalDrugsKG can be queried to retrieve analytical information on target
entities, Figure 1 shows a sample SPARQL query that returns all the statements having the target Drug
causaldrugskg:flecainide as subject, where causaldrugskg:flecainide is the knowledge graph
entry for the popular antiarrhythmic medication. Figure 2 shows the 10 most frequently occurring
Drug and Condition entities in the CausalDrugsKG graph, with over 15% of the extracted triples (out of
the 19,200) having birth control as the Condition, followed by pain, depression and anxiety.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>In this work, we employed JSL-MedLlama, a decoder-only LLM, for extracting causal relationships
from drug reviews, leveraging instruction fine-tuning to enhance its performance. We compared its
performance against encoder-based baselines using the MIMICause dataset, showing that the fine-tuned
model achieves superior results in the classification task.</p>
      <sec id="sec-4-1">
        <title>Concluding Remarks</title>
        <p>[Figure 1: sample SPARQL query over CausalDrugsKG:]
PREFIX causaldrugskg: &lt;http://causaldrugskg.org/causaldrugskg/resource/&gt;
PREFIX causaldrugskg-ont: &lt;http://causaldrugskg.org/causaldrugskg/ontology#&gt;
SELECT ?statement
FROM &lt;CausalDrugsKG&gt;
WHERE {
  ?statement a rdf:Statement .
  ?statement rdf:subject causaldrugskg:flecainide .
}</p>
        <p>To assess the robustness and cross-domain generalization of our approach, we applied our fine-tuned
model to the Drug Reviews dataset, generating CausalDrugsKG, a knowledge graph of 3,050 unique
triples linking 1,149 drugs to 322 conditions through the five considered causal relation types. The
conducted expert annotation on a subset of extracted causal relationships confirmed the accuracy and
reliability of the model, reinforcing its applicability to real-world biomedical scenarios.</p>
        <p>The results highlight the critical role of LLMs in advancing causal reasoning in the biomedical domain
and demonstrate their potential to generate structured knowledge from unstructured patient narratives.
To support future research, we publicly release the fine-tuned model as well as CausalDrugsKG, providing
valuable resources for further advancements in biomedical AI.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>We would like to thank the colleagues of the Digital Health Unit (JRC.F7) at the Joint Research Centre
of the European Commission for helpful guidance and support. The views expressed are purely those of
the authors and may not in any circumstance be regarded as stating an official position of the European
Commission. We acknowledge financial support under the National Recovery and Resilience Plan
(NRRP), Mission 4 Component 2 Investment 1.5 - Call for tender No.3277 published on December
30, 2021 by the Italian Ministry of University and Research (MUR) funded by the European Union –
NextGenerationEU. Project Code ECS0000038 – Project Title eINS Ecosystem of Innovation for Next
Generation Sardinia – CUP F53C22000430001- Grant Assignment Decree No. 1056 adopted on June
23, 2022 by the Italian Ministry of University and Research (MUR). We also acknowledge the financial
support of the project “Data Mesh Platform Builder with AI (DAMPAI)”, funded under the “Fondo per
la crescita sostenibile” by the “Ministero delle Imprese e del Made in Italy”.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT and Grammarly for grammar and
spelling checking. After using these tools/services, the author(s) reviewed and edited the content as
needed and take(s) full responsibility for the publication’s content.</p>
      <p>[23] …adaptation of large language model rescoring for parameter-efficient speech recognition, in: 2023
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), IEEE, 2023, pp. 1-8.
doi:10.1109/asru57964.2023.10389632.
[24] M. L. McHugh, Interrater reliability: The kappa statistic, Biochemia Medica 22 (2012) 276-282.
doi:10.11613/bm.2012.031.
[25] R. Falotico, P. Quatto, Fleiss’ kappa statistic without paradoxes, Quality and Quantity 49 (2015)
463-470. doi:10.1007/s11135-014-0003-1.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] S. Shimizu, S. Kawano, Special issue: Recent developments in causal inference and machine learning, Behaviormetrika 49 (2022) 275-276. doi:10.1007/s41237-022-00173-z.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] A. Akkasi, M.-F. Moens, Causal relationship extraction from biomedical text using deep neural models: A comprehensive survey, Journal of Biomedical Informatics 119 (2021) 103820. doi:10.1016/j.jbi.2021.103820.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] S. Cui, Z. Jin, B. Schölkopf, B. Faltings, The odyssey of commonsense causality: From foundational benchmarks to cutting-edge reasoning, 2024. arXiv:2406.19307.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] X. Shen, S. Ma, P. Vemuri, M. R. Castro, P. J. Caraballo, G. J. Simon, A novel method for causal structure discovery from EHR data and its application to type-2 diabetes mellitus, Scientific Reports 11 (2021). doi:10.1038/s41598-021-99990-7.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] R. Mozer, A. R. Kaufman, L. A. Celi, L. Miratrix, Leveraging text data for causal inference using electronic health records, 2024. arXiv:2307.03687.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] P. Fernainy, A. Cohen, M. E. et al., Rethinking the pros and cons of randomized controlled trials and observational studies in the era of big data and advanced methods: A panel discussion, BMC Proc 18 (Suppl 2) (2024). doi:10.1186/s12919-023-00285-8.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] S. Yadav, S. Ramesh, S. Saha, A. Ekbal, Relation extraction from biomedical and clinical text: Unified multitask learning framework, 2020. arXiv:2009.09509.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] H. Gurulingappa, A. M. Rajput, A. Roberts, J. Fluck, M. Hofmann-Apitius, L. Toldo, Development of a benchmark corpus to support the automatic extraction of drug-related adverse effects from…</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>