=Paper= {{Paper |id=Vol-2878/paper5 |storemode=property |title=Approaches to the Integration of TRUST and FAIR Principles |pdfUrl=https://ceur-ws.org/Vol-2878/paper5.pdf |volume=Vol-2878 |authors=Jaime Delgado,Celia Alvarez Romero,Alicia Martínez García,Carlos Luis Parra Calderón }} ==Approaches to the Integration of TRUST and FAIR Principles== https://ceur-ws.org/Vol-2878/paper5.pdf
Approaches to the integration of TRUST and FAIR principles
Jaime Delgado 1, Celia Alvarez-Romero 2, Alicia Martínez-García 2, and Carlos Luis Parra-Calderón 2
1
  Universitat Politècnica de Catalunya (UPC BarcelonaTECH), Campus Nord, C/ Jordi Girona, 1-3, Barcelona,
08034, Spain
2
  Group of Research and Innovation in Biomedical Informatics, Biomedical Engineering and Health Economy,
Institute of Biomedicine of Seville, IBiS / Virgen del Rocío University Hospital / CSIC / University of Seville,
Avda. Manuel Siurot s/n, Seville, 41013, Spain


                Abstract
                FAIR4Health, an EU-funded project, promotes the application of the FAIR principles (Findable,
                Accessible, Interoperable, Reusable) to data derived from publicly funded health research
                initiatives, so that they can be shared and reused within the European Union health research
                community, defining an effective EU-wide strategy for the use of FAIR in health.
                This paper analyses how to apply FAIR principles to “trust” in the context of the open-source
                software development of FAIR4Health. The FAIRification process and TRUST principles are
                discussed and related. The paper aims to open a new perspective on the trustworthiness of data
                access using open-source software.

                Keywords
                FAIR Principles, Health Data, TRUST Principles


1. Introduction
   The FAIR Data principles [1] aim to ensure that data are shared in a way that enables and enhances
reuse by humans and machines. Although FAIR (Findable, Accessible, Interoperable, Reusable)
emerged from a workshop for the life science community, the principles are intended to be applied to
data and metadata from all disciplines.
   Since their formal release via the FORCE11 community [1], the FAIR data principles have been
adopted by several funders and governments worldwide. The European Commission data management
guidelines were updated in 2017 to introduce the notion of FAIR. In addition, the EOSC
Declaration [2], issued following the Summit in June 2017, and the recent Staff Working Document
proposing an Implementation Roadmap for the European Open Science Cloud [2] both emphasize the
central role of FAIR data.
   FAIR is being adopted by a diverse range of research disciplines. Several groups have been assessing
the uptake to date and the challenges encountered. FAIR4Health [3] and other projects add to the
state of the art by documenting good practices and applying them to other domains where possible.


First workshop on trustworthy software and open source, March 23-25, 2021, Virtual Conference
EMAIL: jaime.delgado@upc.edu (A. 1); celia.alvarez@juntadeandalucia.es (A. 2); alicia.martinez.garcia@juntadeandalucia.es (A. 3);
carlos.parra.sspa@juntadeandalucia.es (A. 4)
ORCID: 0000-0003-1366-663X (A. 1); 0000-0001-8647-9515 (A. 2); 0000-0001-5614-7747 (A. 3); 0000-0003-2609-575X (A. 4)
             © 2021 Copyright for this paper by its authors.
             Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
             CEUR Workshop Proceedings (CEUR-WS.org)


2. FAIR principles and their software implementation
   FAIR4Health has designed and implemented a workflow [4] to apply the FAIR principles to health
research data, based on the FAIRification process of GO FAIR [5], but addressing the ethical, legal and
technical aspects inherent to health data. These aspects have been analysed for
sensitive data, and FAIRification tools, based on the use of the HL7 FHIR standard [6], have been
developed to obtain FAIR data from data resulting from biomedical research.
   FAIRification tools are standalone, open-source desktop applications developed by the
FAIR4Health project. They include:
        • Data Curation Tool: used to connect the health data sources, which can be in various formats
           (Excel files, CSV files, SQL databases), and migrate data into an HL7 FHIR Repository.
           The tool shows the available FHIR profiles to the user so that they can perform the mappings
           appropriately. The tool can also contact a Terminology Server (which is itself another
           HL7 FHIR Repository) so that data fields can be annotated if coding schemes such as ICD-10
           [7] or SNOMED CT [8] are in use.
        • Data Privacy Tool: addresses the privacy challenges posed by sensitive health
           data. It is designed to work on an HL7 FHIR API so that it can be used on top of any standard
           FHIR Repository as a toolset for data de-identification, anonymization, and related actions. The
           tool accesses FHIR resources, presents metadata to the user, guides the user through the
           configuration to be applied and then outputs the processed FHIR resources.
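
   As a rough illustration of what the two tools do conceptually, the following Python sketch maps one
row of a tabular source to a minimal FHIR-style Patient resource and then applies a simple
de-identification step. It is not the FAIR4Health implementation: the column names, the helpers
`row_to_patient` and `deidentify`, and the choice of dropping names and truncating the birth date to the
year are assumptions made only for illustration.

```python
import copy
import csv
import io

# Hypothetical tabular source; the column names are assumptions.
RAW_CSV = """patient_id,family_name,given_name,gender,birth_date
p001,Garcia,Ana,female,1950-04-12
"""


def row_to_patient(row):
    """Map one source row to a minimal FHIR-style Patient resource (a dict)."""
    return {
        "resourceType": "Patient",
        "id": row["patient_id"],
        "name": [{"family": row["family_name"], "given": [row["given_name"]]}],
        "gender": row["gender"],
        "birthDate": row["birth_date"],
    }


def deidentify(patient):
    """Return a copy without names and with the birth date reduced to the year."""
    result = copy.deepcopy(patient)
    result.pop("name", None)                       # drop a direct identifier
    result["birthDate"] = result["birthDate"][:4]  # keep only the year (YYYY)
    return result


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
patients = [row_to_patient(r) for r in rows]       # curation step (mapping)
deidentified = [deidentify(p) for p in patients]   # privacy step
```

   A real configuration would be chosen by the user per resource type and field, as the tool description
above indicates; the sketch hard-codes a single rule set to keep the example short.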
   The architecture of the FAIR4Health ecosystem, as depicted in Figure 1, shows the high-level
process and interoperability of different components.




Figure 1: Overview of the architecture and components of the FAIR4Health ecosystem (FAIR4Health
agents and FAIR4Health platform)

   The FAIR4Health agents, located at the data owner's facilities, enable the application of the
FAIRification process to local datasets through their user-driven Extraction, Transformation
and Loading (ETL) functionalities. At the end of this process, the datasets are standardized, curated and
mapped to domain vocabularies and ontologies. The agents also host instances of Privacy Preserving
Distributed Data Mining services so that they can operate locally without the need to host these datasets
outside the owner's facilities.
   Furthermore, the FAIR4Health platform functions as a facilitator to deploy innovative data-based
services, hosting the repository of actionable data mining models that could be executed. The exchange
of datasets is properly secured on an end-to-end basis to minimize the risk of information disclosure.


3. TRUST principles approach
  This section analyses the concepts related to the TRUST principles and then maps them to the
FAIRification process.

3.1.    TRUST principles concepts
   One of the possible approaches to “trust” is through the so-called “TRUST principles”. We first
consider the approach taken by the eTRIKS project [9]. This project developed a “Recommended Data
sharing model” for which a specific version of the TRUST principles was defined.
   The TRUST principles defined by eTRIKS are:
        • Transparency: data subjects are informed of data users’ requests, if they wish, and of data
            breaches, when required, as established by the GDPR [10].
        • Reciprocity and reward: the contribution in a study of data subjects, data providers, and data
            users is acknowledged or rewarded.
        • Universality: the use of data is open to any registered data user, provided that use is
            authorized by a national law and/or a data subject.
        • Security: data are processed in a controlled environment. Data users and their requested
            processes are recorded for auditing purposes.
        • Tiered data use: the authorization of data use depends on the data type, the analysis purpose,
            the data user’s profile, the analytical algorithm that a data user wants to use, and the data
            subject’s will.
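
   The Tiered data use principle can be read as a decision over five inputs. The Python sketch below is
one possible way to express such a decision. The policy table, the field names and the function
`is_authorized` are hypothetical and do not come from eTRIKS or FAIR4Health; an actual policy would be
derived from national law and the consent given by data subjects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataUseRequest:
    data_type: str          # e.g. "aggregated" or "individual"
    purpose: str            # analysis purpose stated by the data user
    user_profile: str       # e.g. "researcher" or "clinician"
    algorithm: str          # analytical algorithm the data user wants to run
    subject_consents: bool  # the data subject's will for this kind of use


# Hypothetical policy: what each data type (tier) allows.
POLICY = {
    "aggregated": {
        "purposes": {"research", "quality-improvement"},
        "profiles": {"researcher", "clinician"},
        "algorithms": {"descriptive-statistics", "logistic-regression"},
    },
    "individual": {
        "purposes": {"research"},
        "profiles": {"researcher"},
        "algorithms": {"descriptive-statistics"},
    },
}


def is_authorized(request):
    """Authorize only if the data subject agrees and every input fits the tier."""
    tier = POLICY.get(request.data_type)
    if tier is None or not request.subject_consents:
        return False
    return (
        request.purpose in tier["purposes"]
        and request.user_profile in tier["profiles"]
        and request.algorithm in tier["algorithms"]
    )
```

   Under the Security principle, each evaluated request and its outcome would also be recorded for
auditing; that part is omitted here.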
   However, the TRUST principles defined by eTRIKS are not the only approach. Another relevant
definition of TRUST principles [11] comes from the RDA (Research Data Alliance) [12]. The
TRUST principles defined by the RDA are:
        • Transparency: to provide publicly accessible evidence of the services offered by a
            repository.
        • Responsibility: a commitment to provide reliable data services and authenticity and integrity
            of data.
        • User focus: to ensure the implementation and enforcement of the standards and data
            management norms of a specific user community.
        • Sustainability: the capability to support long-term data preservation and use.
        • Technology: the infrastructure and capabilities to support secure, persistent, and reliable
            services.
   These five RDA principles have several associated key concepts: data provenance and
data curation for Transparency; data quality and shared responsibility for Responsibility;
engagement and responsiveness for a broad User focus; involvement of all the partners for Sustainability;
and features such as reliability, flexibility, scalability, transferability, security and agility for
Technology.
   In the context of FAIR principles, we try to combine and integrate the two different approaches to
TRUST principles. One possibility is the following:
        • Transparency: data subjects informed (users’ requests, data breaches, publicly accessible
            evidence of the services).
        • Responsibility & Reciprocity: data subjects, providers and users are acknowledged;
            commitment to provide reliable data services.
        • Universality and standards: access for registered users (if authorized) and use of (implement
            and enforce) standards.
        • Sustainability: capability to support long-term data preservation and use.
        • Technology: infrastructure and capabilities to support the repository operations, including
            Security (controlled data processing and auditing) and Tiered data use.
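
   One way to keep track of where each integrated principle comes from is a small lookup table. The
Python sketch below records one reading of the list above, in which each integrated principle is traced
back to the eTRIKS and RDA principles it draws on. The attribution is an interpretation of the wording
above, and the names `INTEGRATED_TRUST` and `missing_sources` are hypothetical.

```python
# Integrated principle -> (source, original principle) pairs.
# The attribution is an interpretation of the list above, not a normative mapping.
INTEGRATED_TRUST = {
    "Transparency": [("eTRIKS", "Transparency"), ("RDA", "Transparency")],
    "Responsibility & Reciprocity": [
        ("RDA", "Responsibility"),
        ("eTRIKS", "Reciprocity and reward"),
    ],
    "Universality and standards": [("eTRIKS", "Universality"), ("RDA", "User focus")],
    "Sustainability": [("RDA", "Sustainability")],
    "Technology": [
        ("RDA", "Technology"),
        ("eTRIKS", "Security"),
        ("eTRIKS", "Tiered data use"),
    ],
}

ETRIKS = ["Transparency", "Reciprocity and reward", "Universality",
          "Security", "Tiered data use"]
RDA = ["Transparency", "Responsibility", "User focus", "Sustainability", "Technology"]


def missing_sources():
    """List source principles that no integrated principle accounts for."""
    covered = {pair for pairs in INTEGRATED_TRUST.values() for pair in pairs}
    expected = {("eTRIKS", p) for p in ETRIKS} | {("RDA", p) for p in RDA}
    return sorted(expected - covered)
```

   With this reading, `missing_sources()` returns an empty list, that is, all ten source principles are
accounted for by the five integrated ones.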

3.2.    TRUST principles and FAIRification
   To select the specific TRUST concepts to be analysed further in the context of
software development, we need to consider other issues, such as the FAIRification workflow, so that we
can map TRUST concepts to workflow steps. As mentioned in Section 2, FAIR4Health has adopted its
own workflow.
   The steps of the FAIR4Health FAIRification workflow [4] can be summarized as:
        1. Raw data analysis
        2. Data curation & validation
        3. Data de-identification / anonymization
        4. Semantic modeling
        5. Make data linkable
        6. License attribution
        7. Data versioning
        8. (Meta)data aggregation
        9. Archiving
   An example of a mapping between the TRUST principles (adapted to the FAIR context, as described
at the end of subsection 3.1) and the FAIRification workflow steps could be:
        • Transparency: steps 1 and 9.
        • Responsibility & Reciprocity: steps 1, 6 and 8.
        • Universality and standards: steps 2, 3, 5 and 7.
        • Sustainability: steps 2, 7 and 9.
        • Technology: steps 3, 4, 5 and 6.
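
   The example mapping above can be written down as a small table and inverted to see which principles
apply at a given workflow step. The Python sketch below does this; the step names and the mapping are
taken from the text, while the helper names `principles_for_step` and `uncovered_steps` are
hypothetical.

```python
WORKFLOW_STEPS = {
    1: "Raw data analysis",
    2: "Data curation & validation",
    3: "Data de-identification / anonymization",
    4: "Semantic modeling",
    5: "Make data linkable",
    6: "License attribution",
    7: "Data versioning",
    8: "(Meta)data aggregation",
    9: "Archiving",
}

# Example mapping from the text: integrated TRUST principle -> workflow steps.
TRUST_TO_STEPS = {
    "Transparency": [1, 9],
    "Responsibility & Reciprocity": [1, 6, 8],
    "Universality and standards": [2, 3, 5, 7],
    "Sustainability": [2, 7, 9],
    "Technology": [3, 4, 5, 6],
}


def principles_for_step(step):
    """Return the principles mapped to a given workflow step number."""
    return [name for name, steps in TRUST_TO_STEPS.items() if step in steps]


def uncovered_steps():
    """Return workflow steps that no principle is mapped to."""
    covered = {s for steps in TRUST_TO_STEPS.values() for s in steps}
    return [s for s in WORKFLOW_STEPS if s not in covered]
```

   For this example mapping, every workflow step is covered by at least one principle, and step 4
(Semantic modeling) and step 8 ((Meta)data aggregation) are each covered by exactly one.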
   Other mappings are possible, depending on the specific use case.
   Nevertheless, mapping the TRUST principles to the FAIRification steps is not the only option we
consider for developing a methodology for trustworthy open-source software implementations. It is
also worth considering the steps involved in accessing information offered through
FAIR principles, which are in any case related to FAIRification. A simple example of such steps could be
Data Acquisition, Data Processing/normalization and Data Request (by a patient, health professional or
researcher).


4. Conclusions
    The FAIR principles open a new approach to the sharing of information in different areas of research.
The FAIR4Health project focuses on promoting and implementing these principles in the context of
health data. The software being developed is open-source.
    The paper raises a new issue in the consideration of trustworthiness, not only for the software but also
for the information to be shared. Our approach to trust is based on the so-called TRUST principles, on which
there is no full agreement yet. Therefore, we are trying to refine these TRUST principles in the context
of the FAIRification process. The result will be a methodology to maintain trust and the FAIR principles while
developing open-source software for providing access to health data. More work is still needed to
produce something formal, but we have tried to define the approach to follow.
    Finally, concerning future work, apart from the continuation of the analysis started here, we are
working on another methodology for adding security and privacy to FAIR-oriented systems and to the
different associated steps.


5. Acknowledgements
   This work was performed in the framework of the FAIR4Health project, with funding from the European Union’s
Horizon 2020 programme under grant agreement number 824666. In addition, the work presented in this
paper has been partially supported by the Generalitat de Catalunya (2017 SGR 1749).


6. References

[1] The FAIR Data Principles. URL: https://www.force11.org/group/fairgroup/fairprinciples.
[2] The EU's open science policy. URL:
     https://ec.europa.eu/info/research-and-innovation/strategy/goals-research-and-innovation-policy/open-science_en.
[3] FAIR4Health Project. URL: https://www.fair4health.eu.
[4] Sinaci, A. A., Núñez-Benjumea, F. J., Gencturk, M., Jauer, M. L., Deserno, T., Chronaki, C., ... &
     Parra-Calderón, C. L. “From raw data to FAIR data: the FAIRification workflow for health
     research”. Methods of Information in Medicine, 59(S 01), e21-e32 (2020).
[5] GO FAIR initiative. URL: https://www.go-fair.org.
[6] HL7 FHIR website. URL: http://hl7.org/fhir.
[7] World Health Organization. Classification of Diseases (ICD). URL:
     https://www.who.int/classifications/icd/en.
[8] Systematized Nomenclature of Medicine -- Clinical Terms (SNOMED CT). URL:
     http://www.snomed.org.
[9] eTRIKS (European Translational Information and Knowledge Management Services). URL:
     https://www.etriks.org.
[10] GDPR (General Data Protection Regulation): Regulation (EU) 2016/679 of the European
     Parliament and of the Council of 27 April 2016, Official Journal of the European Union, L 119, 4
     May 2016. URL: https://gdpr-info.eu.
[11] Lin, D. (2019). The TRUST Principles for Trustworthy Data Repositories – An Update.
     RDA/WDS Repository Certification IG. In: RDA/WDS Certification of Digital Repositories IG.
     URL: https://www.rd-alliance.org/groups/rdawds-certification-digital-repositories-ig.html.
[12] Research Data Alliance. URL: https://www.rd-alliance.org.