Motivation

modelling for Duchenne and Becker Muscular Dystrophy rare diseases

Pietro Spitali

P.Spitali@lumc.nl 1 2

Núria Queralt-Rosinach

1 2

Pablo Perdomo-Quinteiro

1 2

Sergiu Siminiuc

sergiu@duchennedatafoundation.org 0 1

Paraskevi Sakellariou

sakellariou.elvina@duchennedatafoundation.org 0 1

Marco Roos

M.Roos@lumc.nl 1 2

0 Duchenne Data Foundation, Veenendaal, the Netherlands

1 FAIR data, semantic modelling, BYOD, Duchenne Muscular Dystrophy, Becker Muscular Dystrophy

2 Leiden University Medical Center, Leiden, the Netherlands

2023


The BIND project is an EU-funded project that aims to improve the characterisation of brain involvement in Duchenne and Becker Muscular Dystrophies (DMD and BMD). Here, we present our ongoing work on making multimodal data FAIR and the semantic models, and discuss challenges and opportunities based on our FAIRification experience.

SWAT4HCLS 2023: The 14th International Conference on Semantic Web Applications and Tools for Health Care and Life Sciences

∗Corresponding author.

Motivation

The BIND project is an EU-funded project that aims to improve the characterisation of brain involvement in Duchenne and Becker Muscular Dystrophies (DMD and BMD). To achieve this, 19 organizations across Europe and Japan are collaborating to obtain phenotypic and molecular data, including transcriptomic, proteomic, clinical and behavioral data and MRI brain images. It is crucial to make this information Findable, Accessible, Interoperable and Reusable (FAIR), because FAIR data allow machines to exchange, integrate and analyze these data within the consortium and with external FAIR data sources. Additionally, making data FAIR improves transparency and reproducibility and maximizes the use of scientific output. Ultimately, FAIR data will advance our understanding of DMD and BMD and drive innovation in new therapeutic strategies. To achieve FAIRification, i.e., the process of making data FAIR, data must be properly described and annotated using the standard metadata models and ontologies that the DMD and BMD community uses for research, and must be stored in a persistent and accessible repository. Additionally, data must be available in machine-readable formats and should follow best practices for data management and sharing.

The FAIRification process has been developed together with EBRAINS, a digital research infrastructure created by the EU-funded Human Brain Project with the objective of understanding human brain function and disease. By adopting EBRAINS standards, the project can benefit from the information gathered in EBRAINS, and other researchers can in turn make use of the data provided by BIND. The data have been FAIRified using Semantic Web technologies (RDF, OWL and ShEx), reusing the semantic models provided by the European Joint Programme on Rare Diseases (EJP RD). Our approach is to use FAIR implementations selected by the rare disease community. To this end, we worked in collaboration with the EJP RD, a key driver project for the FAIRification of rare disease resources for research that represents a large part of the rare disease community. Our method is based on the FAIRification workflow [1], where the first step is to assess the FAIR status of the data.

Several Bring Your Own Data (BYOD) workshops [2] were organized throughout 2022 with a twofold objective: first, assessing the FAIR status of the data and explaining the meaning of the data to FAIR experts; second, bringing the FAIRification methodology closer to the data owners. In these workshops, we as FAIR experts needed to fully understand the data to perform the FAIRification process correctly. One challenge is that data owners are often unfamiliar with the FAIRification process and, vice versa, FAIR experts are often unfamiliar with domain-specific knowledge. Another challenge is to reach a consensus on what the data mean and how best to represent them. The modelling follows the semantic core data model of the set of common data elements for rare disease patient registries provided by the EJP RD [3]. Here, we present our ongoing work on making multimodal data FAIR and the semantic models we are developing to represent data such as patient phenotypic behavioral information compiled in the form of questionnaires or single-cell RNA-seq data obtained in preclinical mouse models, and we discuss challenges and opportunities based on our FAIRification experience so far.
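To illustrate what representing a questionnaire answer as RDF could look like, the following is a minimal sketch in Python that writes four triples in N-Triples syntax. It is not the BIND or EJP RD model: the `https://example.org/bind/` namespace, the class `QuestionnaireAnswer`, the properties `aboutPatient`, `answersItem` and `hasScore`, and the identifiers `P001` and `Q07` are all hypothetical placeholders chosen for illustration. Only the `rdf:type` and `xsd:integer` IRIs are standard.

```python
# Minimal sketch: one questionnaire answer as RDF triples in N-Triples syntax.
# All example.org IRIs below are hypothetical placeholders, not BIND or EJP RD terms.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
EX = "https://example.org/bind/"  # hypothetical namespace


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def typed_literal(value: int, datatype: str) -> str:
    """Render an integer as a typed literal."""
    return f'"{value}"^^<{datatype}>'


def questionnaire_answer_triples(patient_id: str, item_id: str, score: int) -> list[str]:
    """Describe a single answer: its type, the patient, the item and the score."""
    answer = f"{EX}answer/{patient_id}/{item_id}"
    triples = [
        (iri(answer), iri(RDF_TYPE), iri(EX + "QuestionnaireAnswer")),
        (iri(answer), iri(EX + "aboutPatient"), iri(f"{EX}patient/{patient_id}")),
        (iri(answer), iri(EX + "answersItem"), iri(f"{EX}item/{item_id}")),
        (iri(answer), iri(EX + "hasScore"), typed_literal(score, XSD_INTEGER)),
    ]
    # One statement per line, each terminated by " ."
    return [f"{s} {p} {o} ." for s, p, o in triples]


lines = questionnaire_answer_triples("P001", "Q07", 2)
print("\n".join(lines))
```

In a real FAIRification setting, the placeholder terms would be replaced by terms from community ontologies and the shape of the graph would be validated, for example against a ShEx schema. Both steps are outside the scope of this sketch.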

Acknowledgments

We thank the whole BIND team, the data owners and the domain experts. We also thank the EJP RD for providing the semantic models used in this project. This project was supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 847826 (Brain Involvement iN Dystrophinopathies (BIND)).

[1]

Jacobsen, et al., A Generic Workflow for the Data FAIRification Process, Data Intelligence 2 (2020) 56-65. URL: https://direct.mit.edu/dint/article/2/1-2/56/9988/A-Generic-Workflow-for-the-Data-FAIRification. doi:10.1162/dint_a_00028.

[2]

Hooft, et al., ELIXIR-EXCELERATE D5.3: Bring Your Own Data (BYOD) (2019). URL: https://zenodo.org/record/3207809. doi:10.5281/zenodo.3207809.

[3]

Kaliyaperumal, et al., Semantic modelling of common data elements for rare disease registries, and a prototype workflow for their deployment over registry data, Journal of