<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A goal-oriented method for FAIRification planning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>César H. Bernabé</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tiago Prince Sales</string-name>
          <email>t.princesales@utwente.nl</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Erik Schultes</string-name>
          <email>eriks@gofair.foundation</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Niek van Ulzen</string-name>
          <email>niek.van.ulzen@knmi.nl</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Annika Jacobsen</string-name>
          <email>a.jacobsen@lumc.nl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luiz Olavo Bonino da Silva</string-name>
          <email>l.o.boninodasilvasantos@utwente.nl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Barend Mons</string-name>
          <email>b.mons@lumc.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Roos</string-name>
          <email>m.roos@lumc.nl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>GO FAIR Foundation</institution>
          ,
          <addr-line>Leiden</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Leiden University Medical Centre</institution>
          ,
          <addr-line>Leiden</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>R&amp;D Observations and Data technology, Royal Netherlands Meteorological Institute (KNMI)</institution>
          ,
          <addr-line>De Bilt</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Semantics, Cybersecurity &amp; Services, University of Twente</institution>
          ,
          <addr-line>Enschede</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>The Leiden Academic Center for Drug Research</institution>
          ,
          <addr-line>Leiden</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The FAIR Principles provide guidance on how to improve the Findability, Accessibility, Interoperability, and Reusability of digital resources. Since the publication of the principles in 2016, several workflows have been proposed to support the process of making data FAIR (FAIRification). However, to respect the uniqueness of different communities, both the principles and the available workflows have been deliberately designed to remain agnostic in terms of standards, tools, and related implementation choices. Consequently, FAIRification needs to be properly planned in advance, and implementation details must be discussed with stakeholders and aligned with FAIRification objectives. To support this, this paper describes a method for identifying and refining FAIRification objectives. Leveraging best practices and techniques from requirements and ontology engineering, the method aims at incrementally elaborating the most obvious aspects of the domain (e.g. the initial set of elements to be collected) into complex and comprehensive objectives. The definition of clear objectives enables stakeholders to communicate effectively and make informed implementation decisions, such as defining achievement criteria for distinct principles and identifying relevant metadata to be collected.</p>
      </abstract>
      <kwd-group>
        <kwd>FAIR</kwd>
        <kwd>FAIRification</kwd>
        <kwd>FAIRification objectives</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The vast amount of data generated every day is only valuable if it can be properly interpreted
and reused. However, it is humanly infeasible to manually merge and make sense of all
the information currently available; therefore, the support of machines is required. Although
machines can automatically analyse and interpret data to efficiently find useful information,
they still require time-consuming human support to prepare and merge data [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. To address this,
the FAIR principles have been proposed to guide the transformation and production of resources
that are Findable, Accessible, Interoperable and Reusable by humans and machines [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. FAIR
resources can be easily managed by machines with minimal human intervention, thus reducing
human workload.
      </p>
      <p>
        The four letters of FAIR are further decomposed into 15 principles [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Findability is
enforced by using globally unique and persistent identifiers to refer to data and metadata (F1),
describing data with rich metadata (F2), explicitly associating metadata with data (F3), and
indexing metadata in searchable resources (F4). Accessibility is achieved by using standardised,
open communication protocols for data exchange (A1, A1.1) that allow access authorisation
procedures (A1.2) while ensuring the longevity of metadata (A2). Interoperability is enhanced
by publishing metadata and data in broadly applicable knowledge representation languages (I1),
reusing vocabularies that also follow the FAIR principles (I2), and including qualified references
to other metadata and data (I3). Finally, reusability is facilitated by describing metadata and data
with accurate and relevant attributes (R1), including usage licences (R1.1), detailed provenance
(R1.2) and using domain-relevant community standards (R1.3).
      </p>
      <p>Data that is made FAIR (FAIRified data) has significant value in many areas. One such area
is rare diseases, where projects such as the European Joint Programme on Rare Diseases (EJP
RD)<sup>1</sup> interoperate FAIR data and metadata from different institutions for the benefit of rare
disease research. Without FAIR, this inherently siloed and dispersed knowledge would be of
reduced value, as it would not be large enough to answer research questions on its own.</p>
      <p>
        The process of making data FAIR (‘FAIRification’) is organised in steps by FAIRification
workflows (e.g., [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ]). Nonetheless, neither the FAIR principles nor the FAIRification
workflows mandate the use of any specific standard, format or software. This is because FAIR and
FAIRification have been made agnostic to respect the unique requirements and needs that
different communities face when managing and sharing data. Therefore, FAIR can be
implemented in different manners and at different levels. However, this flexibility requires careful
guidance throughout the FAIRification process to ensure that the implementation decisions
(e.g., standards, metadata) align with the FAIRification objectives. In fact, the identification of
FAIRification objectives is the initial and crucial step of several FAIRification workflows [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        The task of properly identifying goals and requirements has been studied by the requirements
engineering community from a software development perspective (e.g., [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]). The literature in
this area shows that the lack of proper planning and refinement of goals and requirements
has a significant impact on the software development process. For instance, Pressman [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
points out that changing requirements after the software product has been delivered can
cost 60 to 100 times more than changing them during the software planning
phase. We hypothesise that inadequate identification of FAIRification objectives may have a
similar impact on planning and executing a FAIRification process. However, there is a lack
of research on methods specifically focused on supporting FAIRification planning via the
identification and refinement of FAIRification objectives. Furthermore, a recent study on the
challenges of FAIRification concluded that clarifying goals prior to implementation is a key step
in FAIRification, as it helps the team to make decisions that are consistent with its objectives [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        To address the aforementioned gap, we developed GO-Plan (Goal-Oriented FAIRification
Planning), a method to plan FAIRification through a systematic identification and refinement
of FAIRification objectives. The method reflects our understanding that distinct objectives can
have different impacts on the planning and execution of FAIRification. Consequently, resources
should be made FAIR at a level that aligns with the specific objectives of the FAIRification
project. That is, resources should be made “FAIR enough” to fulfil the objectives of the involved
collaborators<sup>2</sup>. Thus, the FAIRification planning should focus not only on the selection of
suitable technologies or standards, but also on prioritising the effort required to raise the FAIR
level of the targeted resources. Moreover, as FAIRification is a community-driven, aspirational
and incremental process [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], these objectives must encompass the perspectives of collaborators
directly participating in the project and also relevant external stakeholders (i.e., those who will
eventually reuse the FAIRified resource). As such, each effort undertaken to make a resource
FAIR (or more FAIR—FAIRer) for one’s own objectives will also make that resource FAIRer for
others.
      </p>
      <p>
        GO-Plan was designed based on good practices from requirements engineering (e.g.,
goal-oriented approaches [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], competency questions [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and ontology engineering [
        <xref ref-type="bibr" rid="ref12">12, 13</xref>
        ]) while
embedding our experiences from FAIRification projects, including training on FAIR [14], and
conducting FAIRification within single [15] and among multiple institutions [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Additionally,
the method has been optimised based on feedback obtained from a real-world application in
developing a FAIR ontology catalogue [16].
      </p>
      <p>
        The method hereby described is applicable to both post-hoc FAIRification [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], where existing
resources are made FAIR, and de novo FAIRification [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], where resources are created FAIR (e.g.,
data made FAIR upon collection).
      </p>
      <p>We discuss related works in Section 2. Then, we describe GO-Plan and illustrate it with a
fictitious running example in Section 3. Finally, Section 4 discusses the strengths and weaknesses
of our proposal, our impressions from its real-world application, and implications for future
research. In the remainder of this paper, we use the spelling “(meta)data” to refer to both
data and metadata. The words “goal” and “objective” are used as synonyms. Note that the
literature on FAIRification workflows usually uses the word “objective”, while the requirements
engineering literature usually uses the word “goal”.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>
        Several workflows and frameworks have been proposed to support FAIRification in different
ways [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The generic [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and the de novo [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] FAIRification workflows define the steps to
be followed in the FAIRification of different types of FAIR resources, and both describe the
identification of FAIRification objectives as the first step of FAIRification. Similarly, the FAIRplus
FAIRification framework [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] defines steps to be followed during FAIRification and a work plan
layout to support organising the FAIR implementation work. The first phase of this framework
consists of setting “realistic and practical goals” [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], with focus on defining an acceptable “FAIR
enough” state for the resource to be made FAIR. A valuable recommendation given by FAIRplus
is to avoid “the word ‘FAIR’ and its derivatives in goals entirely as it is too general to impart
clear meaning” [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
        <fn id="fn2">
          <label>2</label>
          <p>When referring to collaborators, we align with the understanding of the similar term “stakeholder”, given by [<xref ref-type="bibr" rid="ref10">10</xref>] as individuals, groups, or organizations that affect or are affected by a given project.</p>
        </fn>
      </p>
      <p>
        While these and other FAIRification workflows define a step for identifying FAIRification
objectives [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], to the best of our knowledge, none of them provides detailed guidance
on defining FAIRification objectives or on other planning-related aspects, such as
distinguishing between the different types of stakeholders involved in FAIRification projects.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. The goal-based FAIRification planning method</title>
      <p>GO-Plan aims at supporting FAIRification planning by systematically defining mature
FAIRification objectives through iterative steps. From our experience, we have found that starting with
small steps and building on them is a more feasible approach than describing objectives from
scratch. The method initially targets the most visible characteristics of the FAIRification project,
such as the project domain, scope and available resources. It then leverages them to address
more complex aspects such as relevant data concepts and competency questions. Finally, by
following this structured and incremental approach, the method guides stakeholders towards
the definition of comprehensive objectives that encompass all relevant aspects of FAIRification.</p>
      <p>GO-Plan is organised in six phases, namely (i) FAIRification preparation, (ii) assessment
of current FAIR supporting infrastructure and target resources, (iii) preparation of project
collaborators, (iv) identification of domain scope and reuse stakeholders, (v) refinement of
FAIRification goals and alignment to FAIR principles, and (vi) decision-making. These phases
are refined in several steps and described in the sections that follow.</p>
      <p>A distinction between two categories of stakeholders (collaborators) is made throughout the
phases of the method: project stakeholders and reuse stakeholders. The former refers to those
who are involved in the FAIRification project and have their own goals and requirements for
it (e.g., data custodians, patient representatives). The latter refers to those who will eventually
reuse the FAIRified resource (e.g., researchers).</p>
      <p>The method should be applied once the FAIRification project has already
been conceived, for instance, when the organisation’s board members have already agreed on
FAIRification to address a certain need. At this stage, it is assumed that some aspects, such as the
group of people that will be involved in the FAIRification project and the target resources, have
already been defined. Moreover, GO-Plan is aimed at guiding people with varying levels of
experience, from beginners to experts in FAIR and in goal-oriented elicitation of objectives.
However, people with distinct levels of experience can use the method in different manners. For
instance, a beginner would follow every step of the method to ensure an effective identification
of FAIRification objectives. In contrast, an expert leading a FAIRification project would use
the method not only for identifying and refining the FAIRification objectives, but also to
communicate the aspects of FAIRification to the rest of the team. Additionally, researchers,
newcomers and educators can use the method as a knowledge source.</p>
      <p>GO-Plan has been used to improve the FAIRness of a catalogue for ontology-driven conceptual
modelling research, henceforth the OntoUML catalogue [17, 16], which contains a growing set
of conceptual models defined using the OntoUML modelling language [18] or by extending
the Unified Foundational Ontology (UFO) [19]. The OntoUML catalogue was initially built
using an ad hoc FAIRification workflow, as reported in [17]. Later, the FAIR aspects of the
catalogue were reviewed using the method presented in this paper. The reader can refer to Sales
et al. [16] for a detailed description of the method’s application. The feedback received during
the use of GO-Plan was used to adjust the phrasing of phases and steps (e.g.,
clarification), their ordering (e.g., removing unnecessary steps) and the artefacts produced (e.g., making
the list of metadata concepts explicit). Our impressions of the application of the method are
discussed in Section 4.</p>
      <p>The following subsections describe GO-Plan using a running example of a research
organisation that collects data about patients with rare diseases. This organisation has two aims: (i) to
make legacy data FAIR (i.e. post-hoc FAIRification), and (ii) to implement an Electronic Data Capture
(EDC) system that creates FAIR data at the point of collection (i.e. de novo FAIRification).
In addition to budget and deadline, the most important requirement for this project is the
protection of patient privacy through controlled access to the data. The organisation wants to
publish non-sensitive data and metadata to foster research on rare diseases.</p>
      <sec id="sec-3-1">
        <title>3.1. Phase 1: FAIRification preparation</title>
        <p>As shown in Figure 1, the method begins with preparation tasks that entail examining the
documents from the conception of the FAIRification project (e.g., grant proposals, kick-off slides, meeting
minutes) and/or holding meetings with related stakeholders (e.g., managers, IT personnel) to
identify artefacts that will support subsequent phases. The artefacts produced in all phases are
described and exemplified in Table 1.</p>
        <p>To illustrate, an analysis of the grant application for the rare diseases registry project is
conducted to identify relevant stakeholders (step 1b) and to determine the goals and requirements
of the project (steps 1a and 1e), as exemplified in Table 1. In addition, conducting interviews
with project leaders, patient representatives, and researchers can help to identify additional
goals and requirements, as well as to identify what resources need to be made FAIR (i.e., legacy
patient data and the EDC system) (1c). The organisation’s information technology (IT) team,
together with a FAIR expert, can assist in understanding the existing infrastructure (e.g., storage
server for data and metadata, longevity plan for metadata) (1d) and determining the
adaptations required to accommodate the resource to be made FAIR (e.g., changes to
the data storage format of the EDC system).</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Phase 2: Assessment of FAIRification infrastructure</title>
        <p>This phase addresses the resources to be made FAIR and the organisation’s currently available
FAIR supporting infrastructure. As shown in Figure 2, the resources to be made FAIR are
assessed (step 2a) to check if they can be retrieved (e.g., are they on a SQL server hosted locally?
On a USB stick at the researcher’s home office? Can the current EDC system be modified to
generate ontologised data?), understood (e.g., are the headers of CSV files documented? Are
the data elements collected by the current EDC system clear enough?) and if there are legal
constraints in place (e.g., limited access due to privacy-sensitive data).</p>
        <p>Similarly, the current infrastructure that will accommodate the FAIR resource needs to be
reached and assessed (2c) to check if it can be used, if it needs to be adapted and/or if additional
infrastructure needs to be arranged. The type of infrastructure may vary depending on the type
of FAIR resource it is intended to support. For example, to make data FAIR, the infrastructure
may include storage servers for data and metadata, and data capturing systems (that might
have to be adapted). In the case of privacy-sensitive data, an access control system must be
incorporated. Similarly, to make an ontology FAIR, the infrastructure may involve an ontology
repository and a metadata server. In the case of software, it can include a software code
repository and a version control system.</p>
        <p>The primary aim of these steps is to ensure that both the resources to be made FAIR and the
current infrastructure intended to accommodate the FAIR resource do not pose any obstacles to
FAIRification. This involves verifying, for instance, the availability and capability of storage
servers to handle the data volume associated with FAIRification, among other considerations. If
any issues are identified in this phase, they must be addressed before continuing to the next
phase (steps 2b and 2d).</p>
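        <p>As an illustrative sketch only (it is not part of GO-Plan itself), the gating behaviour of this phase can be expressed as a checklist in which any failed check blocks the transition to the next phase. The <monospace>Check</monospace> structure, the function names and the example questions are assumptions introduced here for illustration.</p>
        <code language="python">
```python
from dataclasses import dataclass


@dataclass
class Check:
    """One assessment question from Phase 2 (hypothetical structure)."""
    step: str        # "2a" for target resources, "2c" for infrastructure
    question: str
    passed: bool


def open_issues(checks: list[Check]) -> list[Check]:
    """Return the checks that failed and must be resolved (steps 2b and 2d)."""
    return [c for c in checks if not c.passed]


def ready_for_phase_3(checks: list[Check]) -> bool:
    """Phase 3 can start only when no Phase 2 issue remains open."""
    return not open_issues(checks)


# Example answers are invented for the running rare disease registry example.
checks = [
    Check("2a", "Can the legacy patient data be retrieved?", True),
    Check("2a", "Are the CSV headers documented?", False),
    Check("2c", "Can the storage server handle the data volume?", True),
]

for issue in open_issues(checks):
    print(f"[{issue.step}] unresolved: {issue.question}")
print("Ready for Phase 3:", ready_for_phase_3(checks))
```
        </code>
        <p>Recording each check with its step label keeps the link between an open issue and the step (2b or 2d) in which it should be resolved.</p>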
        <p>Finally, at this stage, the team must have enough information to decide whether retrospective
and/or de novo FAIRification must be planned for the resources identified. For instance, if a
patient registry needs to make existing data FAIR, but also needs to start generating FAIR data
as it is collected, then both retrospective and de novo FAIRification will need to be planned.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Phase 3: Preparation of FAIRification stakeholders</title>
        <p>
          The third phase of the method focuses on identifying and preparing the people who will be
involved in the FAIRification project. For this, the list of the initial project collaborators is used.
The main aim of this task is to bridge the knowledge gap between domain and FAIR experts to
prepare them for subsequent phases. The motivation for this comes from the work of Neuhaus
&amp; Hastings [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], who suggest techniques to involve stakeholders in the ontology development
process. By engaging the project collaborators in each other’s domains, we reuse the authors’
proposed techniques of “creating micro-level consensus” (micro-level: project scope), which is
expected to establish a more inclusive participatory environment for the discussion of objectives.
        </p>
        <p>In this phase, the group of project collaborators is categorised into FAIR experts and domain
experts (3a). Then, relevant knowledge gaps between them are assessed to an extent that allows
for sufficient understanding of each other’s expertise (3b). This will create a common “ground
language” for stakeholders to communicate their own objectives.</p>
        <p>To exemplify, FAIR experts involved in our example project (i.e., rare disease registry
FAIRification) could have a question-and-answer session with domain experts about common data
elements for rare disease registration [22]. Meanwhile, domain experts could attend a short lecture on the
basics of the FAIR principles and on what can be expected from, and done with, FAIR data. We emphasise
that, for the sake of expectation management, it is important to inform domain experts about
what is possible with FAIR and what should not be expected as output from a FAIRification
project. For instance, while FAIR data may facilitate it, a data visualisation dashboard is an
unusual output of FAIRification.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Phase 4: Identification of domain scope and groups of reuse stakeholders</title>
        <p>Phase 4 relies on the premise that reuse is the ultimate aim of FAIR, and therefore the
FAIRification objectives must consider eventual reuse case scenarios. As shown in Figure 3, the list of
project goals and research/business questions are used as input in this phase to identify and describe
the domain scope (4a). For instance, rare diseases are the domain of the rare disease registry
FAIRification project, while the scope refers to a subset of the domain that considers only the
terms of interest for the FAIRification project (e.g., information from patients with rare diseases
including treatment procedures may be within the scope, while other medical information
unrelated to the rare disease might be out of the scope).</p>
        <p>This phase also consists of identifying semantic types pertaining to the scope (4b). We refer
to semantic types as groups of concepts of similar meaning (e.g., pain is a semantic type group
that covers similar concepts such as discomfort, ache, and soreness). In our running example,
semantic types would include patient, treatment, diagnosis and genetic information. These would
also be useful in later stages of FAIRification (i.e., conceptual modelling of (meta)data). Next, in
step 4c, the semantic types and their definitions are discussed and agreed upon by the group of
domain experts. During the agreement process, they may identify additional semantic types to
be added to the list.</p>
        <p>In step 4d, the description of the domain and semantic types is used to identify reuse
stakeholders. To illustrate, a researcher and a healthcare provider are examples of stakeholders who
will reuse patient, diagnosis and treatment data from the rare disease patient registry. Next,
the expected goals of the reuse stakeholders when reusing the FAIR resource are predicted by
the FAIR project stakeholders (4e). For instance, using the data to “identify cohorts for clinical
trials” may be a goal of the researcher towards the rare disease patient registry. Other examples
of reuse stakeholders can be patient representatives, clinicians and healthcare providers. The
list of reuse stakeholders and their goals should also be validated with domain experts (4f).</p>
        <p>Note that, in step 4d, a fully comprehensive list of stakeholders should not be expected,
as it would be very difficult to predict all eventual reuse cases. However, the FAIRification
planning team should strive to create a list that considers relevant expected cases. We also
point out that later project extensions to incorporate more reuse cases should be technically
feasible given the flexibility of FAIR resources.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Phase 5: FAIRification goals refinement and alignment to FAIR principles</title>
        <p>
          As depicted in Figure 4, the fifth phase of the method starts by reusing the list of semantic
types defined in the previous phase to identify competency questions (CQs) [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] that should be
answered by the FAIR resource (5a), including the metadata of the resource. In the context of
a FAIRification project, a CQ should be a question that cannot be answered without the FAIR
resource, or that can be answered in a significantly easier manner with the FAIR resource. We
suggest that CQs elicited in this step should be complex enough to connect and explore the
relationship between different semantic types. Table 2 shows some examples of CQs that can
be defined for the semantic types exemplified in Section 3.4. In step 5b, the CQs are assigned
to related stakeholders (i.e., reuse stakeholders and relevant project stakeholders) and further
refined as objectives (5c). These objectives can be identified by asking why a certain CQ needs
to be answered and how it can be answered. Some objectives are also exemplified in Table 2.
        </p>
        <p>The objectives identified from the CQs are then aligned with related principles (5d). In this
step, the team should identify which FAIR principles support achieving a specific
objective, and how. For instance, the objective “public awareness of rare diseases is improved” (Figure 5),
which is further refined until it can be realised by the task “collect and publish demographic
statistics”, may be supported by F2 (rich metadata to make the patient registry findable) and R1.1
(data licence to allow reuse of the data for demographic statistics). Meanwhile, other principles
(e.g., F1) may not be prioritised for this specific objective.</p>
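        <p>One possible way to record the outcome of step 5d is a simple lookup table from objectives to the principles that support them, each with a short rationale. The sketch below is illustrative only: the <monospace>alignment</monospace> table and both function names are assumptions made here, and only the example objective and its two rationales are taken from the text above.</p>
        <code language="python">
```python
# Identifiers of the 15 FAIR principles, in their usual order.
FAIR_PRINCIPLES = [
    "F1", "F2", "F3", "F4",
    "A1", "A1.1", "A1.2", "A2",
    "I1", "I2", "I3",
    "R1", "R1.1", "R1.2", "R1.3",
]

# Hypothetical alignment table for step 5d: objective -> {principle: rationale}.
alignment = {
    "public awareness of rare diseases is improved": {
        "F2": "rich metadata to make the patient registry findable",
        "R1.1": "data licence to allow reuse of the data for demographic statistics",
    },
}


def supporting_principles(objective: str) -> list[str]:
    """Principles recorded as supporting the objective, in canonical order."""
    chosen = alignment.get(objective, {})
    unknown = set(chosen) - set(FAIR_PRINCIPLES)
    if unknown:
        raise ValueError(f"unknown principle identifiers: {sorted(unknown)}")
    return [p for p in FAIR_PRINCIPLES if p in chosen]


def not_prioritised(objective: str) -> list[str]:
    """Principles with no recorded rationale for the objective."""
    chosen = alignment.get(objective, {})
    return [p for p in FAIR_PRINCIPLES if p not in chosen]


objective = "public awareness of rare diseases is improved"
print(supporting_principles(objective))
print("F1" in not_prioritised(objective))
```
        </code>
        <p>Keeping the rationale next to each principle documents how the principle supports the objective, and not only which principle was chosen.</p>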
        <p>To facilitate the management of objectives, we suggest the use of goal-modelling techniques
such as iStar [23], which helps to capture the stakeholders’ intentions and their relationships
in a structured way. Models created with iStar include concepts such as actors, goals, tasks,
resources, and relationships such as decomposition and contribution links. The reader is referred
to [23] for further information on iStar.</p>
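        <p>For readers who prefer a concrete data structure, the following sketch shows one way a goal diagram could be held in memory. It is a deliberate simplification and not the iStar metamodel: all element types share a single hypothetical <monospace>Element</monospace> class, every link is treated as a generic refinement, and the actor and resource names are invented for the running example.</p>
        <code language="python">
```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A node of a simplified goal diagram (not the full iStar metamodel)."""
    kind: str                      # "goal", "task" or "resource"
    name: str
    refinements: list["Element"] = field(default_factory=list)

    def refine(self, child: "Element") -> "Element":
        """Attach a child element and return it, so calls can be chained."""
        self.refinements.append(child)
        return child


@dataclass
class Actor:
    """A stakeholder together with the top-level goals they hold."""
    name: str
    goals: list[Element] = field(default_factory=list)


def collect(element: Element, kind: str) -> list[str]:
    """Depth-first list of the names of all elements of the given kind."""
    found = [element.name] if element.kind == kind else []
    for child in element.refinements:
        found.extend(collect(child, kind))
    return found


awareness = Element("goal", "public awareness of rare diseases is improved")
task = awareness.refine(Element("task", "collect and publish demographic statistics"))
task.refine(Element("resource", "demographic (meta)data"))    # hypothetical resource
representative = Actor("Patient representative", goals=[awareness])  # hypothetical actor

for goal in representative.goals:
    print(collect(goal, "task"))
```
        </code>
        <p>Walking the tree by element kind gives the list of tasks or (meta)data resources attached to an objective, which is the information later phases draw on.</p>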
        <p>The final step of this phase consists of using the list of semantic types to identify related
FAIRification projects (5e) through, for instance, the use of FAIR Implementation Profiles
(FIPs) [24] or catalogues such as FAIRsharing [25]. FIPs are specifications of implementation
solutions for realising the FAIR principles in a specific context or domain, and their use is
intended to foster convergence on FAIR implementation decisions [24]. Another example of
a knowledge source for implementation solutions is the Smart Guidance RD Wizard,
a questionnaire-based tool to guide data stewards in making rare disease patient registries
FAIR [26]. In the context of GO-Plan, related projects can support collecting implementation
solutions that can be reused in the FAIRification project. The EJP RD project [21] is one such
related project for our running example.</p>
      </sec>
      <sec id="sec-3-6">
        <title>3.6. Phase 6: Decision making</title>
        <p>The sixth and last phase of the method starts by prioritising feasible objectives (6a) given
the project requirements (e.g., data privacy) and constraints (e.g., budget, deadline, available
expertise). At this point, prioritisation also includes removing objectives that are not feasible,
may not be supported by FAIR principles or are not related to FAIRification. Then, the prioritised
objectives are further refined (6b) and tasks required to realise them are elicited. Here it is
recommended that the team estimates the cost and time associated with the elicited tasks to
assist in further prioritisation of goals given the project requirements (6c).</p>
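        <p>Steps 6a to 6c can be sketched in Python as follows. This is only one possible reading of the prioritisation, under several assumptions that are not part of the method: priorities are integers assigned by the team (1 is highest), cost is expressed in arbitrary units, tasks run one after another so that durations add up, and objectives are selected greedily in priority order. All objective names other than "Find demographic data about patients" and all figures are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class Objective:
    name: str
    priority: int       # 1 = highest; hypothetical scale set by the team
    feasible: bool      # False if requirements (e.g., data privacy) rule it out
    fair_related: bool  # False if no FAIR principle supports the objective
    cost: float         # estimated cost of the elicited tasks, arbitrary units
    days: int           # estimated duration of the elicited tasks


def prioritise(objectives, budget, deadline_days):
    """Drop infeasible or unrelated objectives, then pick greedily by priority."""
    candidates = [o for o in objectives if o.feasible and o.fair_related]
    candidates.sort(key=lambda o: o.priority)

    selected, spent, elapsed = [], 0.0, 0
    for objective in candidates:
        # Skip an objective that would exceed the budget or the deadline.
        if spent + objective.cost > budget or elapsed + objective.days > deadline_days:
            continue
        selected.append(objective)
        spent += objective.cost
        elapsed += objective.days
    return selected


objectives = [
    Objective("Find demographic data about patients", 1, True, True, 5.0, 20),
    Objective("Reuse data for demographic statistics", 2, True, True, 8.0, 30),
    Objective("Redesign the registry user interface", 1, True, False, 4.0, 10),
    Objective("Publish all patient-level records openly", 1, False, True, 2.0, 5),
    Objective("Track provenance of each record", 3, True, True, 6.0, 15),
]

for objective in prioritise(objectives, budget=15.0, deadline_days=60):
    print(objective.priority, objective.name)
```
]]></preformat>
        <p>A greedy pass is easy to explain to stakeholders but is not guaranteed to make the best use of the budget; a team with many competing objectives may prefer to discuss the trade-offs directly on the goal diagram.</p>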
        <p>Next, the most appropriate solutions for prioritised objectives are identified and selected
considering the project goals, requirements, expertise and the limitations of available supporting
infrastructure (6d). This step can be supported by reusing solutions from the similar projects
identified in step 5e, by consulting experts on FAIR or by querying resources such as
FAIRsharing and the Smart Guidance RD Wizard. Next, the necessary (meta)data for achieving the
identified tasks are listed (6e) and described in the goal diagrams as resources, as exemplified in
Figure 5. Subsequently, the team needs to assess whether there is a need to adapt the supporting
infrastructure for the prioritised goals and, if so, add goals to address this need (6e). Finally, the
expertise required for the implementation of the selected solutions (6f) is defined.</p>
        <p>To illustrate, the reuse of the EJP RD Metadata Model is a possible implementation choice
for the objectives depicted in Figure 5 (in the context of F2 – “Find demographic data about
patients”) given the project requirements, and semantic modelling expertise would be required
to support reusing this solution.</p>
        <p>At this point, the goal diagram should contain enough information to inform and guide
FAIRification. The FAIRification objectives, tasks and chosen implementation solutions can now
be seen as actions to be taken towards realising FAIRification. It is up to the experts conducting
the FAIR project to prioritise tasks and define implementation cycles and evaluation activities.
We suggest using a FAIRification workflow to organise the FAIRification process that follows.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Final remarks</title>
      <p>The method presented in this paper is defined with sequential phases and steps. However,
we have observed that real-world applications, such as the one described in Sales et al. [16],
may benefit from an agile approach. In this case, the method can be fitted into one iteration
and executed several times, or have its phases broken down into different cycles that can be
executed iteratively until the outputs of those phases are satisfactory. For instance, in a first
iteration, the process of creating the competency question can raise the need to include more
semantic types, which can be addressed during the method’s re-execution in a second cycle, or
in a re-execution of phase 5. It is up to the FAIRification team to decide how many iterations
should be performed considering the project constraints (especially budget and time).</p>
      <p>Additionally, distinct FAIRification iterations can be tailored to address the specific needs and
considerations of different stakeholders, thereby defining different levels of FAIR and related
aspects for them. That is particularly valuable, for instance, when dealing with sensitive data
(e.g. when different types of users have access to different portions of the data) or with FAIRification projects
involving non-public data (e.g. from private companies), where certain reuse stakeholders might
have limited access to the (meta)data.</p>
      <p>We acknowledge the need for a more detailed evaluation of the expected benefits of the
method when compared to ad hoc FAIRification. We are currently working on evaluating
GO-Plan from a usability perspective, where we will study the perception of users when using
the method (i.e., “is it easier and more efficient to define FAIRification objectives using GO-Plan
compared to ad hoc FAIRification planning?”). In addition, we emphasise that the method is
based on techniques from software engineering that have already been evaluated and used in
several real-world applications (e.g., [27, 28]).</p>
      <p>When applying the method to a real-world use case [16], we observed a significant influence
of the definition of reuse stakeholders on the results of FAIRification, particularly in identifying
which (meta)data concepts should be collected and published, as well as considerations regarding
licensing and provenance. We attribute this impact to the fundamental emphasis of FAIR on
facilitating reusability and assert that optimising the resource for reuse cases is key to effective
FAIRification. Furthermore, we also observed that using goal model diagrams facilitated
communication among collaborators.</p>
      <p>When comparing the real-world use case with [16] and without [17] the use of our method,
we noticed that our approach led to more informed and clearer decision-making and evaluation
of the FAIRness of the catalogue. The stakeholders were able to prioritise solutions based on a
comprehensive understanding of the relationship between objectives and the FAIR principles. To
illustrate, the use of our method resulted in a re-definition of metadata concepts to be collected,
a reprioritisation of the principles (e.g., more attention was given to R1), and the inclusion
of FAIR supporting infrastructure such as the FDP. Finally, we observed that the objectives
helped stakeholders in establishing achievement criteria for principles that lacked sufficient
precision. For instance, the team was able to define a metadata set that would satisfy the “data
are described with rich metadata” (F2) principle by ensuring that it supported all prioritised
goals from the reuse stakeholders.</p>
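      <p>The achievement criterion described for F2 amounts to a coverage check: a metadata set is considered rich enough when every prioritised goal finds the metadata elements it needs. The Python sketch below shows one way to express that check. The function name, the goal "Reuse data for demographic statistics" and the metadata element names are hypothetical and are not taken from the use case in [16].</p>
      <preformat preformat-type="code"><![CDATA[
```python
def uncovered_goals(metadata_set, goal_requirements):
    """Return, per goal, the required metadata elements absent from metadata_set."""
    missing = {}
    for goal, required in goal_requirements.items():
        gap = set(required) - set(metadata_set)
        if gap:
            missing[goal] = sorted(gap)
    return missing


# Hypothetical prioritised goals and the metadata elements each one needs.
goal_requirements = {
    "Find demographic data about patients": {"title", "description", "population coverage"},
    "Reuse data for demographic statistics": {"licence", "population coverage"},
}

# A draft metadata set that does not yet support every goal.
draft = {"title", "description", "licence"}
print(uncovered_goals(draft, goal_requirements))

# Adding the missing element closes the gap for both goals.
revised = draft | {"population coverage"}
print(uncovered_goals(revised, goal_requirements))
```
]]></preformat>
      <p>An empty result means that, under these assumed requirements, the metadata set supports all prioritised goals; any remaining entry points to the elements that still have to be collected.</p>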
      <p>The main aim of the work presented in this paper is to help all FAIR enthusiasts to better define
clear FAIRification objectives and plans that can lead to successful FAIRification. Nonetheless,
we argue that communities should actively endeavour to share their FAIRification planning
artefacts (e.g., goal diagrams, implementation decisions, FIPs) in order to accelerate standards
convergence, disseminate solutions to implementation challenges, and share experiences so
that others can prepare and execute FAIRification faster and more seamlessly. To support this,
we propose that FAIRification plans, including goals and mappings to related principles, should
also be made FAIR. In addition, we emphasise the publication of FAIR implementation
decisions (i.e. FIPs) as an effective means to gradually diminish the work for subsequent projects
and (re)users. This will also allow future work to focus on creating a catalogue of FAIRification
plans and associated concrete tasks that can lead to improved automation.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>We thank the LUMC Biosemantics and the EJP RD FAIRification Stewards groups for constant
feedback on this research. This initiative has received funding from the European Union’s
Horizon 2020 research and innovation programme under grant agreement N°825575 and the
Trusted World of Corona (TWOC; LSH Health Holland).</p>
      <p>[13] R. de Almeida Falbo, Sabio: Systematic approach for building ontologies, Onto.Com/odise@Fois 1301 (2014).</p>
      <p>[14] C. H. Bernabé, L. Thielemans, C. Carta, et al., Building expertise on FAIR through evolving
Bring Your Own Data (BYOD) workshops: Describing the data, software, and management
focused approaches and their evolution, 2023. Manuscript in preparation.</p>
      <p>[15] N. Queralt-Rosinach, R. Kaliyaperumal, C. H. Bernabé, et al., Applying the FAIR principles
to data in a hospital: challenges and opportunities in a pandemic, Journal of Biomedical
Semantics (2022).</p>
      <p>[16] T. P. Sales, P. P. F. Barcelos, C. M. Fonseca, et al., A FAIR catalog of ontology-driven
conceptual models, 2023. Manuscript submitted to Data &amp; Knowledge Engineering.</p>
      <p>[17] P. P. F. Barcelos, T. P. Sales, M. Fumagalli, et al., A FAIR model catalog for ontology-driven
conceptual modeling research, in: Conceptual Modeling. ER 2022, volume 13607, Springer,
2022, p. 3–17.</p>
      <p>[18] G. Guizzardi, C. M. Fonseca, A. B. Benevides, et al., Endurant types in ontology-driven
conceptual modeling: Towards OntoUML 2.0, in: Conceptual Modeling. ER 2018, volume
11157, Springer, 2018, p. 136–150.</p>
      <p>[19] G. Guizzardi, A. Botti Benevides, C. M. Fonseca, et al., UFO: Unified Foundational Ontology,
Applied Ontology 17 (2022) 167–210.</p>
      <p>[20] OMG, Business Process Model and Notation (BPMN), Version 2.0, 2011. URL: http://www.omg.org/spec/BPMN/2.0.</p>
      <p>[21] European Joint Programme for Rare Diseases, EJP-RD VP Resource Metadata Schema,
https://github.com/ejp-rd-vp/resource-metadata-schema, 2021. Accessed on April 24, 2023.</p>
      <p>[22] EU RD Platform, Set of common data elements, https://eu-rd-platform.jrc.ec.europa.eu/set-of-common-data-elements_en, accessed 2023.</p>
      <p>[23] F. Dalpiaz, X. Franch, J. Horkoff, iStar 2.0 language guide, arXiv preprint arXiv:1605.07767
(2016).</p>
      <p>[24] E. Schultes, B. Magagna, K. M. Hettne, et al., Reusable FAIR implementation profiles as
accelerators of FAIR convergence, in: Advances in Conceptual Modeling. ER 2020, volume
12584, Springer, 2020.</p>
      <p>[25] S.-A. Sansone, P. McQuilton, P. Rocca-Serra, et al., FAIRsharing as a community approach
to standards, repositories and policies, Nature Biotechnology (2019).</p>
      <p>[26] P. van Damme, P. Alarcón Moreno, A. Cámara Ballesteros, C. H. Bernabé, C. M. A.
Le Cornec, B. Dos Santos Vieira, K. J. van der Velde, S. Zhang, C. Carta, R. Cornet, P. A.
’t Hoen, A. Jacobsen, M. A. Swertz, M. Roos, N. Benis, A resource for guiding data
stewards to make European rare disease patient registries FAIR, Data Science Journal (2023).
Manuscript submitted for publication.</p>
      <p>[27] C. Pacheco, I. García, M. Reyes, Requirements elicitation techniques: A systematic literature
review based on the maturity of the techniques, IET Software (2018).</p>
      <p>[28] J. Horkoff, F. B. Aydemir, E. Cardoso, et al., Goal-oriented requirements engineering: An
extended systematic mapping study, Requirements Engineering 24 (2019) 133–160.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Thomer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Akmon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>York</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Tyler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Polasek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lafia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hemphill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Yakel</surname>
          </string-name>
          ,
          <article-title>The craft and coordination of data curation: Complicating workflow views of data science</article-title>
          ,
          <source>Proceedings of the ACM on Human-Computer Interaction</source>
          <volume>6</volume>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Wilkinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumontier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. J.</given-names>
            <surname>Aalbersberg</surname>
          </string-name>
          , et al.,
          <article-title>The FAIR guiding principles for scientific data management and stewardship</article-title>
          ,
          <source>Scientific Data</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Jacobsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kaliyaperumal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. O.</given-names>
            <surname>Bonino da Silva Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mons</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Schultes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Roos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <article-title>A generic workflow for the data FAIRification process</article-title>
          ,
          <source>Data Intelligence</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K. H.</given-names>
            <surname>Groenen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jacobsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Kersloot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. dos Santos</given-names>
            <surname>Vieira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>van Enckevort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kaliyaperumal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. L.</given-names>
            <surname>Arts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>'t Hoen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cornet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Roos</surname>
          </string-name>
          , et al.,
          <article-title>The de novo FAIRification process of a registry for vascular anomalies</article-title>
          ,
          <source>Orphanet Journal of Rare Diseases</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Welter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Juty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rocca-Serra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Henderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Strubel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Giessmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Emam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gadiya</surname>
          </string-name>
          , et al.,
          <article-title>FAIR in action - a flexible framework to guide FAIRification</article-title>
          ,
          <source>Scientific Data</source>
          <volume>10</volume>
          (
          <year>2023</year>
          )
          <fpage>291</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B. dos Santos</given-names>
            <surname>Vieira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>Bernabé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Henriques</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Camara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A. R.</given-names>
            <surname>García</surname>
          </string-name>
          ,
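          <string-name>
            <given-names>J.</given-names>
            <surname>van der Velde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>van Damme</surname>
          </string-name>
          ,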
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Benis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Strubel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schoots</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>L'Henaff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>'t Hoen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Roos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jacobsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cornet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Wilkinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schaefer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Swertz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jetten</surname>
          </string-name>
          ,
          <article-title>Critical steps towards large-scale implementation of the FAIR data principles</article-title>
          ,
          <year>2023</year>
          . URL: https://doi.org/10.5281/zenodo.7867293.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Pressman</surname>
          </string-name>
          ,
          <article-title>Software engineering: A practitioner's approach</article-title>
          , 7th ed.,
          <publisher-name>McGraw-Hill</publisher-name>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Van Lamsweerde</surname>
          </string-name>
          ,
          <article-title>Goal-oriented requirements engineering: A guided tour</article-title>
          ,
          in:
          <source>Proceedings Fifth IEEE International Symposium on Requirements Engineering</source>
          , IEEE,
          <year>2001</year>
          , pp.
          <fpage>249</fpage>
          -
          <lpage>262</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B. dos Santos</given-names>
            <surname>Vieira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>Bernabé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , et al.,
          <article-title>Towards FAIRification of sensitive and fragmented rare disease patient data: Challenges and solutions in European Reference Network registries</article-title>
          ,
          <source>Orphanet Journal of Rare Diseases</source>
          <volume>17</volume>
          (
          <year>2022</year>
          )
          <fpage>436</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Freeman</surname>
          </string-name>
          ,
          <article-title>Strategic management: A stakeholder approach</article-title>
          ,
          <publisher-name>Pitman</publisher-name>
          ,
          <year>1984</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Grüninger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <article-title>The role of competency questions in enterprise engineering</article-title>
          ,
          <source>Benchmarking - Theory and Practice</source>
          (
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>F.</given-names>
            <surname>Neuhaus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hastings</surname>
          </string-name>
          ,
          <article-title>Ontology development is consensus creation, not (merely) representation</article-title>
          ,
          <source>Applied Ontology</source>
          (
          <year>2022</year>
          ). Preprint.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>