A framework for representing clinical research in FHIR

Hugo Leroux1,5[0000−0002−2033−8178], Christine K Denney2,5[0000−0002−7180−6867], Smita Hastak3,5, and Hugh Glover4,5

1 The Australian E-Health Research Centre, CSIRO, Brisbane QLD 4029, Australia
hugo.leroux@csiro.au
2 Eli Lilly and Company, Lilly Corporate Center, Indianapolis IN 46285, U.S.A.
christi d@lilly.com
3 Samvit Solutions, Reston VA 20190, U.S.A.
shastak@samvit-solutions.com
4 Blue Wave Informatics LLP, Exeter, EX4 5AH, United Kingdom
hugh glover@bluewaveinformatics.co.uk
5 HL7 Biomedical Research and Regulations Working Group
https://confluence.hl7.org/display/BRR/


                      Abstract. The benefits of clinical research have been widely acknowl-
                      edged. However, clinical research is often costly, time-consuming, and
                      burdensome to both the participants and researchers. There has recently
                      been much emphasis on the need to streamline how clinical research is
                      conducted and maximise the benefits of research through the sharing of
                      research data and methods. In this paper, we explore the suitability of
                      the Health Level 7 FHIR standard for representing and managing clin-
                      ical research. While FHIR has gained popularity within patient care,
                      the development of FHIR models and solutions to facilitate the deliv-
                      ery of clinical research is still in the early stages of maturity. This work
                      outlines the activities of the HL7 Biomedical Research and Regulations
                      FHIR working group in developing FHIR-based models and solutions for
                      designing and conducting clinical research more effectively. Our goal is
                      to ascertain whether a native, FHIR-based, API definition is suitable
                      for clinical research, can alleviate the issues relating to both the dis-
                      coverability and accessibility of clinical research data, and enable the
                      semantic interoperability of the data that can lead to the reusability of
                      the datasets. We outline how the FHIR resources have the potential to
                      overcome the challenges of sharing and reusing clinical research data. We
                      discuss some of the current limitations associated with those resources
                      and how we are working to address them. Our overarching goal for this
                      work is to stimulate a robust discussion on how clinical research seman-
                      tics and data exchange use cases could be represented in FHIR.

                      Keywords: Clinical Research · Data Sharing · FHIR · Data model ·
                      FAIR.


Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2       H. Leroux et al.

1   Introduction
Clinical research is an integral and important part of healthcare delivery. A re-
port on the economic benefit of clinical research data sharing in Australia has
found that the Australian government invests $1.5 billion in health research and
development annually [1]. It estimates that the value to the Australian Gross Do-
mestic Product could exceed $129 million annually if data from publicly-funded
clinical research was made accessible to the research community [1]. However,
the effectiveness of clinical research relies on its ability to have an impact on
health [2].
    Clinical research is costly, time-consuming and taxing both on the clinical
researchers and on the participants. There has been much emphasis, lately, on the
need to streamline the way in which clinical research is conducted to maximise
the benefit [2, 3]. There has also been a push for clinical research data and
methods to be shared more broadly. Warren [4] stated that ‘data sharing may
help reduce costs by allowing researchers to avoid duplicating trials or to answer
questions without undertaking a separate data collection effort’.
    A common theme across data sharing initiatives is the ‘idea of building in-
frastructures based on rich metadata’ that will ‘support their optimal re-use’ [5].
Mons et al. [5] stated that ensuring that all resources are findable, accessible,
interoperable and reusable (FAIR), ‘requires widely shared and adopted stan-
dards and principles’. FAIR refers to a set of principles focused on ensuring that
research objects are reusable, will be leveraged, and become as valuable as possi-
ble [5]. There has been an increasing focus on reproducibility and replicability of
clinical research [2] resulting from findings that over 70% of published research
cannot be reproduced by others [6].
    The Fast Healthcare Interoperability Resources (FHIR) framework is an emerg-
ing standard that is geared towards the communication of clinical data using
HL7 messaging protocols and, when supported by a rich information model, can
achieve the semantic interoperability of clinical data. As FHIR is gaining impor-
tance within the healthcare and life sciences community [7] and has been swiftly
adopted by the major healthcare providers (including Cerner [8] and Epic [9]),
FHIR is likely to play a significant role in the future of healthcare and clinical
research [10]. Furthermore, the National Institutes of Health have issued a no-
tice to ‘explore the use of the FHIR standard to capture, integrate, and exchange
clinical data for research purposes and to enhance capabilities to share research
data’ [11]. The current effort in FHIR resource development is primarily focussed
on patient care and geared towards electronic health records (EHRs) and hospi-
tal billing and accountancy systems. Developing FHIR models and solutions to
facilitate the delivery of clinical research is still in the early stages of maturity,
notwithstanding some early efforts [12, 13].
    Our overarching goal in this project is to ascertain whether a native, FHIR-
based, data model is suitable for clinical research, can alleviate the issues relating
to both the discoverability and accessibility of clinical research data, and can
enable the semantic interoperability of the data that can lead to the reusability of
the data sets. In addition, we believe that the adoption of the FHIR standard for
developing clinical research protocols and capturing clinical research data can
also help preserve the integrity of the data and the privacy of individuals through
the adoption of profiling to constrain the content exposed by the resource.
    In the next section, we elaborate on the considerations for representing clin-
ical research in FHIR. We outline the activities of the HL7 Biomedical Research
and Regulations (BR&R) FHIR working group (WG) in developing FHIR-based
resources for designing and conducting clinical research more effectively. We dis-
cuss how the FHIR resources have the potential to overcome the challenges of
sharing and reusing clinical research data. We then discuss some of the current
limitations associated with the FHIR resources and how we are working to address
them.


2   Representing Clinical Research in FHIR

The core components in FHIR are resources, which are logical constructs in
healthcare and define both behaviour and meaning. The resources are scoped to
the most commonly known data exchange implementation needs and collectively
form and support the complex health systems. Extensions are a mechanism
provided by FHIR to allow support for the less common or outlier use cases
of data exchange whose requirements are not in the scope of the base resource
definition. To ensure interoperability, FHIR also enables the creation of profiles
that can be used to constrain the structure of the resource, using some rules
defined by the profiler, to ensure compliance by the implementation systems.
The standardisation of the methods is achieved by defining a set of common
functionality within the resources while the standardisation of the data semantics
is facilitated by allowing and occasionally enforcing the definition of code systems
and value sets that describe the data.
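
As a rough illustration of how a profile and a value set constrain a resource, the following Python sketch treats a resource as a plain dictionary and checks it against a small rule set. The rule structure, the `check_against_profile` helper, the extension URL and the reduced value set are assumptions made for this example only; real FHIR profiles are expressed as StructureDefinition resources and are considerably richer.

```python
# Hypothetical, simplified stand-in for a profile: required elements plus
# a value set binding. This is NOT the FHIR StructureDefinition format.
STUDY_PROFILE = {
    "resourceType": "ResearchStudy",
    "required": ["title", "status"],
    # A deliberately reduced value set, chosen for illustration.
    "bindings": {"status": {"active", "completed", "withdrawn"}},
}


def check_against_profile(resource, profile):
    """Return a list of problems; an empty list means the resource complies."""
    problems = []
    if resource.get("resourceType") != profile["resourceType"]:
        problems.append("wrong resourceType")
    for element in profile["required"]:
        if element not in resource:
            problems.append(f"missing required element: {element}")
    for element, allowed in profile["bindings"].items():
        if element in resource and resource[element] not in allowed:
            problems.append(f"{element} not in value set: {resource[element]}")
    return problems


study = {
    "resourceType": "ResearchStudy",
    "title": "Example study",
    "status": "active",
    # An extension carries data outside the base definition (URL is made up).
    "extension": [
        {
            "url": "http://example.org/fhir/StructureDefinition/funding-round",
            "valueString": "2019-A",
        }
    ],
}

incomplete = {"resourceType": "ResearchStudy", "status": "paused"}

print(check_against_profile(study, STUDY_PROFILE))
print(check_against_profile(incomplete, STUDY_PROFILE))
```

The first call reports no problems, while the second flags both the missing title and the status value that falls outside the bound value set.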
     The HL7 BR&R FHIR WG has been established to facilitate the develop-
ment of common standards and the management of research-focussed domain
analysis models for clinical research information management. BR&R also seeks
to assure that related or supportive standards produced by other HL7 groups
are robust enough to accommodate their use in regulated clinical research. A
shared semantic view is essential if the clinical research community is to achieve
computable semantic interoperability. In this regard, the BR&R and Clinical De-
cision Support FHIR WGs have developed a small number of resources (namely
ResearchSubject, ResearchStudy, PlanDefinition and ActivityDefinition) for de-
scribing clinical research study design in FHIR. These four resources are still
in early stages of design and therefore are at low levels of maturity. The data
are expected to be captured using existing FHIR resources such as Encounter,
Procedure and Observation to name just three. This is anticipated to expedite
the sharing of clinical research data in the future.
     Sharing of clinical research data has numerous challenges relating to the
discoverability of the clinical research undertaken, the availability of the data
sets and associated methods in a machine-readable, structured and standardised
manner, and the adoption of common standards. Addressing these challenges
could produce results and methods that are more easily understandable and
facilitate the reproducibility and replicability of the results. We introduce the
aforementioned resources below and elaborate on how they address the challenges
associated with sharing clinical research data.

2.1   FHIR Resources for Clinical Research
ResearchStudy. The ResearchStudy resource provides a template for defining
the overall structure of a study or trial, including the protocol and the
various arms comprising the study. It provides references to the PlanDefinition
resource to define the protocol for the study; to the Organization resource to
define the sponsor; to a Practitioner resource to define the principal
investigator; and to a Location resource to describe the physical location of
a study site. Other study characteristics, such as the study identifier, the
title, the description, and the category of study, can be defined within the
core resource.
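
To make the references listed above concrete, the sketch below builds a minimal ResearchStudy instance as a Python dictionary. The element names (`protocol`, `sponsor`, `principalInvestigator`, `site`) follow our reading of FHIR R4; all ids, the identifier system and the text values are invented, and `referenced_types` is a hypothetical helper written for this example.

```python
import json

# Illustrative ResearchStudy instance; every id and value here is made up.
research_study = {
    "resourceType": "ResearchStudy",
    "id": "example-study",
    "identifier": [{"system": "http://example.org/study-ids", "value": "STUDY-001"}],
    "title": "Example interventional study",
    "status": "active",
    "description": "Placeholder description of the study.",
    "protocol": [{"reference": "PlanDefinition/example-protocol"}],
    "sponsor": {"reference": "Organization/example-sponsor"},
    "principalInvestigator": {"reference": "Practitioner/example-pi"},
    "site": [{"reference": "Location/example-site"}],
}


def referenced_types(resource):
    """Collect the resource types that this resource points to."""
    found = set()

    def walk(node):
        if isinstance(node, dict):
            ref = node.get("reference")
            if isinstance(ref, str) and "/" in ref:
                found.add(ref.split("/", 1)[0])
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(resource)
    return found


print(json.dumps(research_study, indent=2))
print(sorted(referenced_types(research_study)))
```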

ResearchSubject. The ResearchSubject resource defines a participant in the
study. It provides two mandatory references: one to the ResearchStudy and the
other to the Patient resource. The purpose of the latter is to link an actual
patient to the role of participant in the study. Furthermore, it provides a
reference to the Consent resource to record the participant's consent to take
part in the study.
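
The two mandatory references and the optional consent reference can be sketched as follows. The element names `study`, `individual` and `consent` follow our reading of FHIR R4; `make_research_subject` is a hypothetical helper and the ids are invented.

```python
def make_research_subject(subject_id, study_id, patient_id, consent_id=None):
    """Build a minimal ResearchSubject dictionary (illustrative helper)."""
    # Both references are mandatory, so refuse to build an incomplete instance.
    if not study_id or not patient_id:
        raise ValueError("study and individual references are both required")
    subject = {
        "resourceType": "ResearchSubject",
        "id": subject_id,
        "status": "on-study",
        "study": {"reference": f"ResearchStudy/{study_id}"},
        "individual": {"reference": f"Patient/{patient_id}"},
    }
    # The consent reference is optional and only added when supplied.
    if consent_id is not None:
        subject["consent"] = {"reference": f"Consent/{consent_id}"}
    return subject


subject = make_research_subject("subj-1", "example-study", "pat-1", consent_id="consent-1")
print(subject["study"]["reference"], subject["individual"]["reference"])
```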

PlanDefinition. A PlanDefinition is a pre-defined group of actions to be
taken in particular circumstances, often including conditional elements, options,
and other decision points. The resource is flexible enough to be used to rep-
resent a variety of workflows, as well as clinical decision support and quality
improvement assets, including order sets, protocols, and decision support rules.
Although this resource currently does not fully support the clinical research use
cases, it has a good foundation to be leveraged for defining the protocol in rela-
tion to the complex schedule of activities, objectives, and outcomes. HL7 BR&R
WG members are currently evaluating this resource to identify and map the
protocol concepts, identify gaps, provide updates to definitions, and possibly
consider developing extensions and eventually a Clinical Research or Protocol
FHIR Profile.

ActivityDefinition. An ActivityDefinition is a shareable, consumable de-
scription of some activity to be performed. It may be used to specify actions to
be taken as part of a workflow, order set, or protocol, or it may be used inde-
pendently as part of a catalog of activities, such as orderables. Within clinical
research, this resource would define all the activities that are defined in a proto-
col. This may include administrative activities such as checking eligibility, trial
enrolment, obtaining consent, and capturing the various clinical activities such
as blood collection, urine analysis, etc.
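
One possible way to tie the two definition resources together is for each action in a PlanDefinition to point at an ActivityDefinition by its canonical URL. The sketch below assumes the FHIR R4 `action.definitionCanonical` element; the URLs, titles and the `resolve_actions` helper are invented for illustration and do not represent an agreed protocol profile.

```python
# Illustrative ActivityDefinitions for a few protocol activities (URLs made up).
activity_definitions = [
    {
        "resourceType": "ActivityDefinition",
        "url": "http://example.org/ActivityDefinition/check-eligibility",
        "status": "draft",
        "title": "Check eligibility",
    },
    {
        "resourceType": "ActivityDefinition",
        "url": "http://example.org/ActivityDefinition/obtain-consent",
        "status": "draft",
        "title": "Obtain consent",
    },
    {
        "resourceType": "ActivityDefinition",
        "url": "http://example.org/ActivityDefinition/blood-collection",
        "status": "draft",
        "title": "Blood collection",
    },
]

# A PlanDefinition whose actions reference the definitions above, in order.
plan_definition = {
    "resourceType": "PlanDefinition",
    "status": "draft",
    "title": "Example protocol",
    "action": [
        {"title": ad["title"], "definitionCanonical": ad["url"]}
        for ad in activity_definitions
    ],
}


def resolve_actions(plan, definitions):
    """Return the titles of the ActivityDefinitions referenced by each action."""
    by_url = {d["url"]: d for d in definitions}
    return [by_url[a["definitionCanonical"]]["title"] for a in plan["action"]]


print(resolve_actions(plan_definition, activity_definitions))
```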

2.2   Information Model

Figure 1 illustrates a set (or network) of HL7 FHIR Resources and their re-
lationships that are relevant to a clinical research use case of a Patient in
a role of ResearchSubject who is a participant in a ResearchStudy that is
sponsored by an Organization, being conducted at a particular Location.
The ResearchStudy is being executed based on the protocol definition in a
PlanDefinition, which includes ActivityDefinitions. The ActivityDefinition
describes a CarePlan for each participant that further defines Appointments,
which lead to Encounters that produce Observations that relate to a particu-
lar patient within the study.


[Figure 1: diagram of the resources Organization, Location, ResearchStudy,
PlanDefinition, ActivityDefinition, CarePlan, ResearchSubject, Appointment,
Patient, Encounter, Practitioner and Observation, and their relationships.]

              Fig. 1. FHIR Information Model for Clinical Research.
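
The network described for Figure 1 can be approximated as a small directed graph. The edge list below is our reading of the accompanying text rather than an exact transcription of the figure, and the edges are informal "points to" or "leads to" relations, not FHIR element names.

```python
from collections import deque

# Directed edges as read from the description of Figure 1 (an approximation).
EDGES = {
    "ResearchSubject": ["Patient", "ResearchStudy"],
    "ResearchStudy": ["Organization", "Location", "PlanDefinition", "Practitioner"],
    "PlanDefinition": ["ActivityDefinition"],
    "ActivityDefinition": ["CarePlan"],
    "CarePlan": ["Appointment"],
    "Appointment": ["Encounter"],
    "Encounter": ["Observation"],
    "Observation": ["Patient"],
}


def shortest_path(start, goal, edges=EDGES):
    """Breadth-first search over the resource graph; None if unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(shortest_path("ResearchStudy", "Observation"))
print(shortest_path("Observation", "ResearchStudy"))
```

Following the edges forwards, an Observation is reachable from a ResearchStudy, but no forward path leads from an Observation back to the study. Under these assumed edges, that asymmetry is one way to see why traversal has to follow references in both directions.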


3     Overcoming the Challenges of Sharing and Reuse

We believe that the information model described previously should overcome the
challenges of sharing and reusing clinical research data. The BR&R WG, along
with other working groups, has engaged in a number of initiatives to promote
clinical research data sharing and reuse. We describe two related focus areas
below.

3.1   Activities within BR&R in promoting data sharing and reuse

BRIDG Mapping. The Biomedical Research Integrated Domain Group (BRIDG)
Model [14] is a collaborative effort engaging stakeholders from the Clinical Data
Interchange Standards Consortium (CDISC), the HL7 BRIDG Work Group, the
International Organization for Standardization (ISO), the US National Cancer
Institute (NCI), and the US Food and Drug Administration (FDA). The goal
of the BRIDG Model is to produce a shared view of the dynamic and static se-
mantics for the domain of basic, pre-clinical, clinical, and translational research
and its associated regulatory artefacts.
    The BRIDG model is supported by the HL7 BR&R WG as its domain infor-
mation model and is intended to provide the semantic foundation to the artefacts
developed by BR&R. It is a conceptual model, although parts of the model are
quite granular and therefore often considered a hybrid of conceptual and log-
ical layers. BR&R WG members are leveraging the BRIDG model concepts,
definitions, and relationships to inform FHIR resource models.


CDISC Lab Semantics in FHIR. The BR&R WG and the Orders and Ob-
servations (O&O) WG cosponsored the development of an implementation guide
[15] to provide direction for sites and sponsors seeking to exchange laboratory
data via FHIR (Note: scope is limited to the data collected to evaluate safety of
an interventional study medication).


3.2   The Availability of Machine-Readable Clinical Research Definition

The BR&R WG is exploring ways to make the clinical research study protocol
available in a machine-readable manner. A structured and computable protocol
is important for clinical research, yet the challenge has not been fully addressed
by prior initiatives. The CDISC Protocol Representational Model (PRM) [16] is a
UML-based standard that developed a set of standard protocol concepts and was
intended to be used alongside the other CDISC and HL7 standards. PRM has
now been integrated within the CDISC Controlled Terminology Package (CT)
[17]. The main drawback of the CDISC PRM and CT standards is that they do
not adhere to commonly used clinical terminology standards, such as SNOMED
CT or LOINC, which makes semantic interoperability of the protocol difficult to
achieve. Furthermore, the CDISC PRM has had limited adoption by the clinical
research community [18]. Another initiative, SPIRIT [19], provides a 33-item
checklist for a trial protocol that can be entered electronically. SPIRIT
currently does not allow for coded input and only allows the protocol to be
entered as free text. Furthermore, it does not allow the protocol to be linked
either to a controlled clinical vocabulary, such as SNOMED CT, or to any
publications discussing the study.
    The desired future state would draw upon the existing initiatives and design
a standardised, structured and computable representation of the study protocol
as a set of FHIR artefacts (resources, profiles, extensions) that define the ap-
proved protocol. This should enable the validation of the study data against the
clinical research questions defined within the protocol and what was scheduled
and performed during the study.


3.3   Adopting a Common Standard

There have been a number of initiatives lately to standardise the data for use
both in healthcare and clinical research. The HL7 WG on Semantic Interoper-
ability [20] is engaged in developing models and use cases in facilitating the use
of RDF as a common semantic foundation for healthcare information interop-
erability. One of their key deliverables has been the development of the FHIR
RDF representation and ontology [21]. FHIR RDF might prove useful for im-
plementing our model due to the complexity of representing the bi-directional
nature of clinical research. Another challenge is the gap between patient care
and clinical research data standards. Aerts [22] hints at the convergence of the
CDISC and HL7 standards to bridge that gap. Furthermore, there is a global
effort to standardise clinical research data [2] to translate it into meaningful
discovery and improve benefits to patients. Indeed, the CDISC Lab semantics
in FHIR project [15] suggests that there is currently no standard in place to
provide data to sponsors as they adhere to data standards within the CDISC
suite of standards, whereas healthcare is progressively adopting the FHIR stan-
dard for communication and distribution [23, 24]. The need to standardise data
is vital if we want to achieve meaningful use and semantic interoperability of
clinical research data [12].


4     Discussion

Developing a framework for representing clinical research in FHIR is challeng-
ing but provides an important opportunity for change. We have the potential
and responsibility to guide the next generation of clinical research through our
engagement with researchers, sponsors, regulatory agencies, and industry. We
present our thoughts below in helping to shape this important engagement.


4.1   Resource Context and Workflow

In many traditional domain models [14] for clinical research, the entities may
be used in varying contexts and change state over time. For example, the visit
concept may represent both a planned activity and the resulting performed ac-
tivity. Over time, the attributes and status of the entity adapt to the process.
Conversely, while not a limitation, FHIR differentiates resources by their role
in the workflow process and provides separate resources for the template of the
visit (ActivityDefinition), the designated template of the visit for a particu-
lar participant (CarePlan), the scheduled visit (Appointment), and the occurred
visit (Encounter). In order to correctly identify the desired resource, the imple-
menter must understand the intended use of the resource and how to traverse
the workflow in which it resides.
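
The separation of one conceptual "visit" into four resources can be sketched as a simple lookup. The `VisitStage` enumeration and the helper functions are hypothetical names introduced for this example; only the stage-to-resource pairing is taken from the text above.

```python
from enum import Enum


class VisitStage(Enum):
    """Stages a visit passes through, in workflow order (illustrative)."""
    TEMPLATE = "template of the visit"
    PARTICIPANT_PLAN = "template of the visit for a particular participant"
    SCHEDULED = "scheduled visit"
    OCCURRED = "occurred visit"


# Pairing of each stage with the FHIR resource named in the text.
RESOURCE_FOR_STAGE = {
    VisitStage.TEMPLATE: "ActivityDefinition",
    VisitStage.PARTICIPANT_PLAN: "CarePlan",
    VisitStage.SCHEDULED: "Appointment",
    VisitStage.OCCURRED: "Encounter",
}


def resource_for(stage):
    """Return the resource type that represents the visit at this stage."""
    return RESOURCE_FOR_STAGE[stage]


def next_stage(stage):
    """Return the following stage in the workflow, or None at the end."""
    order = list(VisitStage)  # Enum members iterate in definition order.
    position = order.index(stage)
    return order[position + 1] if position + 1 < len(order) else None


for stage in VisitStage:
    print(stage.name, "->", resource_for(stage))
```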

4.2   Adherence to an information model or foundation ontology
Resources in FHIR are built in a pragmatic manner to facilitate their rapid
implementation. Consequently, the aforementioned resources are not based on
a foundation ontology, such as the Semanticscience Integrated Ontology
(SIO) [25] or the Ontology for Biomedical Investigations (OBI) [26]. It
is also a common misconception that the FHIR resources equate to an informa-
tion model. An information model such as BRIDG or CDISC PRM provides the
concepts, meaning, and relationships between the concepts of a given domain
of interest. These models can be used to inform the design of implementation-
oriented FHIR artefacts. Our model, however, could benefit from some judicious
mapping to the SIO and OBI ontologies. We seek to engage with the Semantic
Web and Life Sciences community to help us facilitate this mapping.

4.3   Linkages to Clinical Research Resources
Clinical research generally cannot have visibility of the Patient, only of the
ResearchSubject, to support de-identification and privacy of participants. How-
ever, creating a ResearchSubject instance requires a reference
to the Patient resource. Consequently, a dummy Patient resource needs to be
created to play the role of participant in the study. Furthermore, many of the
FHIR resources, such as Observations, Procedures and Diagnosis, provide
a mandatory reference to a Patient but not to the ResearchSubject. While
not technically a limitation, it adds another level of complexity for traversing
the model. The CDISC Lab Semantics in FHIR Implementation Guide provides
some guidance to the sites on how to mask the patient identity.
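
The placeholder approach described above might look like the following sketch, in which a Patient carrying no demographics is created solely to satisfy the mandatory reference. The helper names are hypothetical, and this is not the masking procedure of the CDISC Lab Semantics in FHIR Implementation Guide, which should be consulted for actual guidance.

```python
import uuid


def make_placeholder_patient():
    """Create a Patient with no identifying demographics (illustrative only)."""
    # A random id avoids leaking any source-system identifier.
    return {"resourceType": "Patient", "id": str(uuid.uuid4()), "active": True}


def enrol_deidentified(study_id):
    """Return a placeholder Patient and a ResearchSubject that references it."""
    patient = make_placeholder_patient()
    subject = {
        "resourceType": "ResearchSubject",
        "id": str(uuid.uuid4()),
        "status": "on-study",
        "study": {"reference": f"ResearchStudy/{study_id}"},
        "individual": {"reference": f"Patient/{patient['id']}"},
    }
    return patient, subject


patient, subject = enrol_deidentified("example-study")
print(sorted(patient.keys()))
print(subject["individual"]["reference"] == f"Patient/{patient['id']}")
```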

4.4   Model Maturity
In assessing the maturity of the FHIR resources for use in clinical research,
we see potential for enhancements to the currently defined ResearchStudy and
ResearchSubject resources from BR&R as well as to many other resources, such
as Observation and Procedure, that were defined with only clinical use cases
in mind. At present, the ResearchStudy resource contains attributes
designed to capture a text description of the arms for the study. However, this
information is defined during protocol development and therefore, it may be
better to design the concept of arm as part of the PlanDefinition resource and
remove it from the ResearchStudy resource. BR&R is currently discussing this
change on the ResearchStudy resource. ResearchStudy also links to a Location
to represent the study site overseeing a set of ResearchSubjects. However, this
falls short of representing the full context of the study site, such as the site
personnel and study participants assigned to a site.

4.5   Traversing the Model

By their very nature, FHIR resources introduce complex data types and relation-
ships. This, coupled with the adoption of the Representational State Transfer
(REST) framework, means that traversing a network of resources, as depicted
in Figure 1, necessitates moving beyond the lens of a traditional relational de-
sign, which starts at a point and moves in a single direction, to looking at
the model as an ontology of nodes. One such example is collating the partic-
ipants in a research study. The ResearchStudy resource does not contain a
reference to the ResearchSubject enrolled in the study. Rather, the reference
is contained within the ResearchSubject resource. Similarly, when trying to
elucidate an Observation related to a ResearchStudy, one has to traverse the
ResearchStudy to obtain the relevant context, then one needs to work one’s
way back from the Observation to the Encounter to ultimately link the rele-
vant observation to the ResearchSubject via the Patient resource.
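
The reverse traversal described above can be sketched over a small in-memory collection of resources. The data, the `STORE` list and both helper functions are invented for illustration; for brevity the Observation is linked directly to the Patient through `subject`, omitting the intermediate Encounter step. Against a FHIR server, the first step would typically be a search on ResearchSubject by its study reference rather than a scan.

```python
# A tiny in-memory stand-in for a FHIR server (all ids are made up).
STORE = [
    {"resourceType": "ResearchStudy", "id": "s1"},
    {
        "resourceType": "ResearchSubject",
        "id": "rs1",
        "study": {"reference": "ResearchStudy/s1"},
        "individual": {"reference": "Patient/p1"},
    },
    {
        "resourceType": "ResearchSubject",
        "id": "rs2",
        "study": {"reference": "ResearchStudy/s2"},
        "individual": {"reference": "Patient/p2"},
    },
    {"resourceType": "Observation", "id": "o1", "subject": {"reference": "Patient/p1"}},
    {"resourceType": "Observation", "id": "o2", "subject": {"reference": "Patient/p2"}},
]


def subjects_in_study(store, study_id):
    """Find participants by scanning ResearchSubject for a reference to the study."""
    target = f"ResearchStudy/{study_id}"
    return [
        r for r in store
        if r["resourceType"] == "ResearchSubject" and r["study"]["reference"] == target
    ]


def observations_for_study(store, study_id):
    """Link observations to a study via the Patient each ResearchSubject points to."""
    patients = {s["individual"]["reference"] for s in subjects_in_study(store, study_id)}
    return [
        r for r in store
        if r["resourceType"] == "Observation"
        and r.get("subject", {}).get("reference") in patients
    ]


print([s["id"] for s in subjects_in_study(STORE, "s1")])
print([o["id"] for o in observations_for_study(STORE, "s1")])
```

Note that this join returns every observation recorded for a participating patient, whether or not it was collected as part of the study, which is one consequence of Observation referencing the Patient rather than the ResearchSubject.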


5     Conclusion

There is an increasing need to streamline how clinical research is conducted and
maximise the benefits of research through sharing of research data and methods.
This work has explored the suitability of the HL7 FHIR standard to represent
and manage clinical research. We have outlined the activities of the HL7 Biomed-
ical Research & Regulations working group in developing FHIR-based models
and solutions to design and conduct clinical research more effectively. We have
proposed an information model comprising the FHIR resources to semantically
represent the clinical research lifecycle, so as to facilitate semantic interoper-
ability and increased sharing of the data. There have been a number of distinct
standards proposed recently for representing clinical research. Our goal, for this
work, is to stimulate a robust discussion on how clinical research semantics and
data exchange use cases can be represented in FHIR.


References
 1. CIE: The public benefit of collaborative access to publicly funded clinical
    and health studies (2019),
    https://discovery.csiro.au/permalink/f/12s7o4e/CSIRO1196669610001981
 2. Jauregui, B., Hudson, L.D., Becnel, L.B., et al.: Global standardization of clinical
    research data. Applied Clinical Trials 28(4), 18–24 (2019)
 3. Kush, R.D., Nordo, A.H.: Data Sharing and Reuse of Health Data for Research,
    pp. 379–401. Springer International Publishing (2019)
 4. Warren, E.: Strengthening research through data sharing. New England Journal
    of Medicine 375(5), 401–403 (2016)
 5. Mons, B., Neylon, C., Velterop, J., Dumontier, M., da Silva Santos, L.O.B., Wilkin-
    son, M.D.: Cloudy, increasingly FAIR; revisiting the FAIR Data guiding principles
    for the European Open Science Cloud. Information Services & Use 37(1), 49–56
    (2017)
 6. Baker, M.: 1,500 scientists lift the lid on reproducibility. Nature News 533(7604),
    452 (2016)
 7. HL7 FHIR: Argonaut project (2018), http://argonautwiki.hl7.org
 8. Miliard, M.: Cerner touts adoption of normative FHIR R4 standard (2018),
    https://www.healthcareitnews.com/news/cerner-touts-adoption-normative-fhir-r4-standard
 9. Epic: Open Epic (2018), https://open.epic.com/Interface/FHIR
10. Posnack, S., Barker, W.: Heat wave: The U.S. is poised to catch FHIR in
    2019 (2018),
    https://www.healthit.gov/buzz-blog/interoperability/heat-wave-the-u-s-is-poised-to-catch-fhir-in-2019
11. NIH: Fast Healthcare Interoperability Resources (FHIR®) standard (2019),
    https://grants.nih.gov/grants/guide/notice-files/NOT-OD-19-122.html
12. Leroux, H., Metke-Jimenez, A., Lawley, M.J.: Towards achieving semantic interop-
    erability of clinical study data with FHIR. Journal of biomedical semantics 8(1),
    41 (2017)
13. Leroux, H., Metke-Jimenez, A., Lawley, M.J.: ODM on FHIR: Towards achieving
    semantic interoperability of clinical study data. In: SWAT4LS. CEUR (2015)
14. NCI: About BRIDG model (2016), https://bridgmodel.nci.nih.gov/about-bridg
15. HL7 BR&R: CDISC Lab Semantics in FHIR Implementation Guide (2019),
    http://hl7.org/fhir/uv/cdisc-lab/2019Sep/
16. CDISC: Protocol Representation Model (2010), http://www.cdisc.org/protocol
17. CDISC: Controlled terminology (2019), https://www.cdisc.org/standards/terminology
18. Huser, V., Sastry, C., Breymaier, M., Idriss, A., Cimino, J.J.: Standardizing data
    exchange for clinical research protocols and case report forms: An assessment of
    the suitability of the Clinical Data Interchange Standards Consortium Operational
    Data Model. Journal of Biomedical Informatics 57, 88–99 (2015)
19. Chan, A.W., Tetzlaff, J.M., Altman, D.G., Laupacis, A., et al.: SPIRIT 2013 State-
    ment: Defining Standard Protocol Items for Clinical Trials. Annals of Internal
    Medicine 158(3), 200–207 (2013)
20. HL7: RDF for Semantic Interoperability (2016),
    http://wiki.hl7.org/index.php?title=RDF_for_Semantic_Interoperability
21. Solbrig, H., Prud’hommeaux, E., Jiang, G.: Blending FHIR RDF and OWL. In:
    SWAT4LS. vol. 2042. CEUR (2017)
22. Aerts, J.: Towards a single data exchange standard for use in healthcare and in
    clinical research. Studies in health technology and informatics 248, 55–63 (2018)
23. Siwicki, B.: How FHIR 4 will drive interoperability progress in healthcare
    (April 2019),
    https://www.healthcareitnews.com/news/how-fhir-4-will-drive-interoperability-progress-healthcare
24. Borfitz, D.: Imagining a world on FHIR (2019),
    https://www.clinicalinformaticsnews.com/2019/05/02/imagining-a-world-on-fhir.aspx
25. Dumontier, M., Baker, C.J., Baran, J., et al.: The semanticscience integrated on-
    tology for biomedical research and knowledge discovery. Journal of biomedical se-
    mantics 5(1), 14 (2014)
26. Bandrowski, A., Brinkman, R., Brochhausen, M., et al.: The ontology for biomed-
    ical investigations. PloS one 11(4) (2016)