<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Managing food and microbiome studies data using Fairspace, a flexible and FAIR data management platform</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ewelina Grudzien</string-name>
          <email>ewelina@thehyve.nl</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eelke van der Horst</string-name>
          <email>eelke@thehyve.nl</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Frank van den Bergh</string-name>
          <email>frank@thehyve.nl</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria H. Traka</string-name>
          <email>maria.traka@quadram.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Duncan Ng</string-name>
          <email>Duncan.Ng@quadram.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Falk Hildebrand</string-name>
          <email>Falk.Hildebrand@quadram.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chris T. Evelo</string-name>
          <email>chrisevelo@maastrichtuniversity.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Susan L. M. Coort</string-name>
          <email>susan.coort@maastrichtuniversity.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Duygu Dede Şener</string-name>
          <email>d.dedesener@maastrichtuniversity.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elisa Cirillo</string-name>
          <email>elisa@thehyve.nl</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department Bioinformatics - BiGCaT NUTRIM, Maastricht University</institution>
          ,
          <addr-line>Maastricht</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Quadram Institute Bioscience</institution>
          ,
          <addr-line>Norwich Research Park</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>The Hyve B.V.</institution>
          ,
          <addr-line>Arthur van Schendelstraat 650, Utrecht</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Fairspace is an open-source research data management platform that adheres to the FAIR principles. The Hyve created Fairspace in 2016 and has developed it ever since, customizing it for several organizations' use cases. We present the implementation of this tool within the FNS-Cloud consortium, in which Fairspace became the user browser that allows microbiome and food data exploration within public resources mapped to a common (meta)data model and vocabularies/ontologies.</p>
      </abstract>
      <kwd-group>
        <kwd>FAIR</kwd>
        <kwd>Research Data Management</kwd>
        <kwd>Microbiome</kwd>
        <kwd>Food &amp; Nutrition Security</kwd>
        <kwd>Fairspace</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Fairspace is an open-source research data management platform that adheres to the FAIR
principles. It offers a collaborative environment for managing research data (files, collections)
and a metadata catalog with a flexible data model and search interface. It is built on semantic
web technologies, such as RDF and SHACL, and it is FAIR-by-design. The Hyve created
Fairspace in 2016 and has developed it ever since, customizing it for several organizations’
use cases, such as for Institut Curie, a cancer research hospital in Paris, and the Food Nutrition
Security Cloud (FNS-Cloud) microbiome use case, which is presented here.</p>
    </sec>
    <sec id="sec-2">
      <title>2. The FNS-Cloud challenges</title>
      <p>One of the ambitions of the FNS-Cloud consortium is to reduce the fragmentation of food &amp; nutrition
security (FNS) research resources such as datasets and tools. Most of these resources have been
developed independently, with different interfaces or data formats. Therefore, the FNS-Cloud project
aims to set up a first-generation food &amp; nutrition security cloud that integrates already existing research
resources and shares these in a uniform manner in the cloud. Several demonstrators are planned in the
project to provide use cases that guide the development and integration of the FNS-Cloud
tools. The focus for Fairspace development concerns a Microbiome Demonstrator, in which a researcher
wants to find studies regarding diet interventions and wants to combine them with gut microbiome data
for further analysis. In addition, the consortium partners want to leverage the collaboration with the
ELIXIR Infrastructure, which already includes several European food and nutrition resources.
However, ELIXIR does not yet have a solution for food and microbiome data that allows users to find
those data across multiple public data sources for analysis. With this focus, The Hyve designed a
customized solution using Fairspace as the main component that could potentially fill this gap.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Solution Specification</title>
      <p>There are several aspects of Fairspace that make it the right solution for the above
challenges. Firstly, the tool uses FNS-Harmony, a (semantic) metadata model developed by the
consortium to facilitate semantic data integration. In particular, FNS-Harmony
was used to map data source-specific fields and entities to classes and attributes in the
ontology. In addition, The Hyve developed a SHACL model based on specific ontologies
and user requirements. Fairspace uses this model as its content data model, for validation (data
integrity) and for user interface generation. Together, these aspects allow the (meta)data search to be
performed by both users and machines.</p>
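      <p>As an illustration of how a shape-based model can drive validation, the following minimal Python sketch checks records against a simplified stand-in for a SHACL node shape. The shape, the property names and the constraints are hypothetical and are not taken from FNS-Harmony or from the Fairspace data model. A real deployment would express such constraints in SHACL and evaluate them with a SHACL engine rather than with plain dictionaries.</p>
      <preformat preformat-type="code"><![CDATA[
```python
"""Simplified stand-in for SHACL-style validation (illustrative only).

All shape and property names below are hypothetical; they are not part of
FNS-Harmony or the Fairspace data model.
"""

# A toy "shape": each property has a minimum count and an expected Python type,
# loosely mirroring sh:minCount and sh:datatype in SHACL.
STUDY_SHAPE = {
    "title": {"min_count": 1, "datatype": str},
    "diet_intervention": {"min_count": 1, "datatype": str},
    "sample_count": {"min_count": 0, "datatype": int},
}


def validate(record, shape):
    """Return a list of human-readable violations (empty if the record conforms)."""
    violations = []
    for prop, rule in shape.items():
        # Every property is modelled as a list of values, as in RDF.
        values = record.get(prop, [])
        if len(values) < rule["min_count"]:
            violations.append(
                f"{prop}: expected at least {rule['min_count']} value(s)"
            )
        for value in values:
            if not isinstance(value, rule["datatype"]):
                violations.append(
                    f"{prop}: {value!r} is not of type {rule['datatype'].__name__}"
                )
    return violations


if __name__ == "__main__":
    conforming = {
        "title": ["Fibre intervention"],
        "diet_intervention": ["high-fibre"],
        "sample_count": [42],
    }
    incomplete = {"title": ["Untitled"], "sample_count": ["many"]}
    print(validate(conforming, STUDY_SHAPE))
    print(validate(incomplete, STUDY_SHAPE))
```
]]></preformat>
      <p>In this sketch the same declarative description could also be used to decide which input fields a form should show, which mirrors the idea of using one model for both validation and user interface generation.</p>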
      <p>Secondly, The Hyve implemented in Fairspace a set of ETL (extract, transform, load)
processes that work together to fetch data from several public sources, such as ENA,
MGnify, MetaboLights and dbNP. The data is mapped to the common data model defined in
Fairspace, using selected ontologies and vocabularies. Lastly, a JupyterHub environment
integrated with Fairspace allows users to analyse metadata and data using R, Python or Julia.
It includes direct access to a shared Fairspace storage of predefined scripts for data
preprocessing and analysis, as well as to metadata selected in the Fairspace search interface. Users can
access the harmonized metadata through SPARQL and a custom API, and can also access the (file) data.
This work is still in progress, since the FNS-Cloud project will run until September
2023. The ETL customization is not open source.</p>
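      <p>To illustrate what programmatic access through SPARQL could look like from a notebook, the sketch below builds a SELECT query and prepares, without sending, an HTTP request in the style of the SPARQL 1.1 Protocol. The endpoint URL, the namespace, and the class and property names are placeholders chosen for this example; the actual Fairspace API paths and the FNS-Harmony vocabulary are not specified here.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values: replace with the endpoint and vocabulary of a real deployment.
ENDPOINT = "https://fairspace.example.org/sparql"
NAMESPACE = "https://example.org/model#"


def build_query(diet_keyword, limit=10):
    """Build a SELECT query for studies whose diet label contains a keyword."""
    # Escape backslashes and double quotes so the keyword stays a valid
    # SPARQL string literal.
    safe = diet_keyword.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"PREFIX ex: <{NAMESPACE}>\n"
        "SELECT ?study ?diet WHERE {\n"
        "  ?study a ex:Study ;\n"
        "         ex:dietIntervention ?diet .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?diet)), LCASE("{safe}")))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request(query, endpoint=ENDPOINT):
    """Prepare a POST request with a form-encoded 'query' parameter."""
    body = urlencode({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/sparql-results+json",
    }
    # The request is only constructed here; nothing is sent over the network.
    return Request(endpoint, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    request = build_request(build_query("fibre"))
    print(request.get_method(), request.full_url)
    print(build_query("fibre"))
```
]]></preformat>
      <p>Passing the prepared request to <monospace>urllib.request.urlopen</monospace> would execute it against a real endpoint, which in practice would also require the authentication used by the deployment.</p>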
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>Fairspace has proven to be an easy-to-use open-source research data management tool that
is able to resolve different types of data management challenges faced by different
organizations, and it does so in a way that follows the FAIR principles.</p>
      <p>In particular, for the FNS-Cloud consortium, Fairspace became the user browser that allows
microbiome and food data exploration within public resources mapped to a common
(meta)data model and vocabularies/ontologies. Using the built-in faceted search interface, the
researcher can export selected data or analyze it in a secure analysis environment such as Jupyter
Notebook. Finally, in line with the open-source philosophy, The Hyve takes care to develop
code that is readable, easy to maintain, and has high test coverage.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Acknowledgements</title>
      <p>Food Nutrition Security Cloud (FNS-Cloud) has received funding from the European
Union’s Horizon 2020 Research and Innovation programme (H2020-EU.3.2.2.3. – A
sustainable and competitive agri-food industry) under Grant Agreement No. 863059
(www.fns-cloud.eu).</p>
      <p>We thank all the other partners of FNS-Cloud that have contributed to discussions and
collateral work related to the Microbiome Demonstrator use case: Institut Jozef Stefan,
University of Florence, Premotec, and ScaleFocus.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>