=Paper= {{Paper |id=Vol-2975/paper8 |storemode=property |title=Realising a Science Gateway for the Agri-food: the AGINFRAplus Experience |pdfUrl=https://ceur-ws.org/Vol-2975/paper8.pdf |volume=Vol-2975 |authors=Massimiliano Assante,Alice Boizet,Leonardo Candela,Donatella Castelli,Roberto Cirillo,Gianpaolo Coro,Enol Fernandez,Matthias Filter,Luca Frosini,George Kakaletris,Panagis Katsivelis,Rob Knapen,Lucio Lelii,Rob Lokers,Francesco Mangiacrapa,Pasquale Pagano,Giancarlo Panichi,Lyubomir Penev,Fabio Sinibaldi,Panagiotis Zervas }} ==Realising a Science Gateway for the Agri-food: the AGINFRAplus Experience== https://ceur-ws.org/Vol-2975/paper8.pdf
                          11th International Workshop on Science Gateways (IWSG 2019), 12-14 June 2019



       Realising a Science Gateway for the Agri-food:
              the AGINFRA PLUS Experience
               M. Assante∗, A. Boizet†, L. Candela∗, D. Castelli∗, R. Cirillo∗, G. Coro∗, E. Fernández§,
           M. Filter¶, L. Frosini∗, G. Kakaletris‡, P. Katsivelis‖, M.J.R. Knapen∗∗, L. Lelii∗, R.M. Lokers∗∗,
                    F. Mangiacrapa∗, P. Pagano∗, G. Panichi∗, L. Penev††, F. Sinibaldi∗, P. Zervas‖
                    ∗ ISTI - National Research Council of Italy, Pisa, Italy – Email: {name.surname}@isti.cnr.it
                  † French National Institute for Agricultural Research, Paris, France – Email: alice.boizet@inra.fr
                              § EGI Foundation, Amsterdam, Netherlands – Email: enol.fernandez@egi.eu
               ¶ Federal Institute for Risk Assessment (BfR), Berlin, Germany – Email: matthias.filter@bfr.bund.de
                                    ‡ University of Athens, Athens, Greece – Email: gkakas@di.uoa.gr
                          ‖ Agroknow, Athens, Greece – Email: {katsivelis.panagis, pzervas}@agroknow.com
          ∗∗ Wageningen University & Research, Wageningen, Netherlands – Email: {rob.knapen, rob.lokers}@wur.nl
                                     †† Pensoft Publishers, Sofia, Bulgaria – Email: penev@pensoft.net



Abstract: The enhancements in IT solutions and the open science movement are injecting changes in the practices dealing with data collection, collation, processing and analytics, and publishing in all the domains, including agri-food. However, in implementing these changes one of the major issues faced by agri-food researchers is the fragmentation of the “assets” to be exploited when performing research tasks, e.g. data of interest are heterogeneous and scattered across several repositories, the tools modellers rely on are diverse and often make use of limited computing capacity, and publishing practices vary and rarely aim at making available the “whole story” with datasets, processes, and workflows. This paper presents the AGINFRA PLUS endeavour to overcome these limitations by providing researchers in three designated communities with Virtual Research Environments that facilitate the use of the “assets” of interest and promote collaboration.

Keywords: Virtual research environment; Agroclimatic modeling; Food safety risks assessment; Food security

I. INTRODUCTION

The developments in information and communication technologies, including big data availability and management, web and cloud technologies, as well as open science related practices, are not yet fully embraced by the Agriculture and Food Science research domain [1], [2]. The fragmentation of “resources” of interest across several heterogeneous “places” is certainly one of the major factors hindering this uptake process, e.g. data are heterogeneous and scattered across several repositories, modelling tools and supporting systems are diverse, and the amount of available computing capacity varies a lot across teams and laboratories.

The AGINFRA PLUS project has been set up to develop an innovative approach to Agri-food digital science practices, aiming at overcoming the limitations stemming from the above settings by leveraging existing e-Infrastructures and services. In particular, AGINFRA PLUS promotes the exploitation of Virtual Research Environments (VREs) [3] to provide designated communities with seamless access to the data, services, and facilities they need to perform their research tasks in a collaborative way. Such VREs are built by relying on an open and distributed platform (see Sec. II) providing a rich array of services supporting all the phases of an open science research lifecycle, from data collection to data analytics and publication.

AGINFRA PLUS is exploiting the VREs approach for three prominent agri-food research communities, namely: (i) agro-climatic and economic modelling, focusing on use cases related to crop modelling and crop phenology estimation, (ii) food safety risk assessment, focusing on use cases to support scientists in the multidisciplinary field of risk assessment and emerging risk identification, and (iii) food security, focusing on use cases related to high-throughput phenotyping to support phenomics researchers in selecting the most suitable plant species and varieties for specific environments.

The remainder of the paper is organised as follows. Sec. II presents the major constituents of the AGINFRA PLUS platform. Sec. III discusses the exploitation scenarios developed by each community and the benefits resulting from the use of the platform. Finally, Sec. IV concludes the paper by reporting on future work.

II. THE AGINFRA PLUS PLATFORM

In order to support the AGINFRA PLUS communities, a comprehensive and feature-rich platform has been developed and operated. An overall picture of this platform, which aims at offering its facilities through the as-a-Service delivery model, is given in Fig. 1.

The platform follows the system of systems approach [4], where the constituent systems offer “resources” (namely services) for the implementation of the resulting system facilities. In particular, the platform aggregates “resources” from “domain agnostic” service providers (e.g. D4Science [5], EGI [6], OpenAIRE [7]) as well as from community-specific ones (e.g. AgroDataCube [8], AGROVOC [9], RAKIP model repository [10]) to build a unifying space where the aggregated resources can be exploited via VREs [11]. This system of


Copyright © 2021 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).




                                              Fig. 1. AGINFRA PLUS Platform Architecture



systems approach is enabled by D4Science. D4Science is at the heart of the overall platform. In fact, this service provider offers the core services to implement the resulting platform, namely: (a) the AGINFRA PLUS gateway [12], realising the single access point to the rest of the platform (see Fig. 2); (b) the authentication and authorisation infrastructure, enabling users to seamlessly access the aggregated services once they have logged in to the gateway; (c) the shared workspace, for storing, organising and sharing any version of a research artefact [13], including datasets and model implementations; (d) the social networking area, enabling collaborative and open discussions on any topic and disseminating information of interest for the community, e.g. the availability of a research outcome [13]; (e) the overall catalogue, recording the assets worth publishing so that others can be informed of and make use of them [13].

These basic facilities are complemented by services for the semantic-oriented management of data, data analytics, data visualization, and publishing.

A. Semantic Data Management Solutions

The AGINFRA PLUS data & semantics facilities offer an array of services for managing semantic resources (e.g. ontologies, thesauri, vocabularies) and for benefitting from such resources in tasks related to data management. The supported facilities include: (a) an ontology engineering service for creating, editing and managing semantic resources and, at the same time, catering for their collaborative design, editing and management. It is based on VocBench [14], a web-based platform for managing OWL ontologies, SKOS thesauri and RDF datasets; (b) a semantic linking service supporting the establishment of semantic links between data items belonging to different datasets and different sources. It is based on Silk [15], a web-based platform enabling users to manage diverse datasources, linking tasks and transformation tasks; (c) a data transformation service promoting the RDF-isation of tabular data, i.e. a user can determine the rules for transforming the data into triples using arbitrary schemas and ontologies. In practice, it supports the building of an RDF skeleton defining how cell values will be translated into RDF. It is based on the open-source OpenRefine tool [16], a powerful tool for data cleaning and transformation including a plug-in for RDF-isation; (d) an ontology visualisation service enabling users to upload and / or import ontologies and visualise the graph corresponding to the ontology. Classes and instances are represented as circular nodes and properties are represented as edges between these nodes. A side panel giving information on an entity as defined in the ontology completes the offering. It is based on WebVOWL [17], a web-based tool for the interactive visualisation of ontologies; (e) an ontology alignment service helping users to establish mappings between two diverse ontologies or thesauri. It is based on YAM++ [18], a web tool proven to be effective and scalable in ontology matching tasks.

B. Data Analytics Solutions

The AGINFRA PLUS analytics facilities offer a rich array of services for the challenging task of big data analytics [19]. The supported facilities include: (a) a data analytics platform to execute analytics tasks by relying on methods provided either by the user or by others [20]. It is endowed with importing and sharing facilities for analytics methods implemented in heterogeneous forms including R, Java, Python, and KNIME [21] (largely used by the food safety community). The platform executes tasks on a distributed and hybrid computing infrastructure including EGI resources. Moreover, one feature of this platform worth highlighting is its open science-friendliness. All the analytics methods integrated in it are exposed through a standard protocol (the OGC WPS protocol) that clients can use to discover the available methods as well as to start processes, monitor their execution and access results. Every analytics task performed by the platform automatically produces a provenance record catering for the







                                                Fig. 2. AGINFRA PLUS Gateway: the Dashboard



repeatability of the task; (b) an RStudio-based development environment for R, making it possible to perform statistical computing tasks in the cloud. The environment provides its users with a powerful IDE including a console, a source code editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management; (c) a Jupyter-based notebook environment for documenting and recording analytics processes [22]. Every notebook is a rich document that contains live code, equations, visualizations and narrative text aiming at capturing a research activity; (d) a Galaxy-based workflow management workbench for combining several analytics tasks into workflows [23]. In practice, it offers a means to build multi-step computational analyses by specifying what data to operate on, what steps to take, and what order to do these steps in.

All these platforms and environments are integrated with each other as well as with the rest of the services offered by AGINFRA PLUS. For instance, every method integrated in the data analytics platform can be easily executed by a Jupyter notebook or by a Galaxy workflow. All these tools are equipped with solutions facilitating access to the workspace content so that it can be used during the processing steps, e.g. to use files as inputs or to store results. It is straightforward to publish every analytics process implemented by these tools into the catalogue to share it with coworkers.

C. Data Visualization and Publishing Solutions

The AGINFRA PLUS data visualization & publishing facilities provide users with feature-rich and flexible solutions for developing representations (e.g. graphs) out of datasets and publishing “research objects” documenting a research activity and its results. The final goal is to provide the reader with an effective representation of a research activity and its results, thus enabling its repeatability. The supported facilities include: (a) a graphs management workbench for creating several typologies of interactive graphs ranging from generic ones (e.g. Spline, Scatter, Bar, Line, Step, Pie, Doughnut, Polar) to very specific ones (e.g. graphs reporting the height of plants across time with values and images). The platform provides users with facilities to import a dataset of interest, to define how its content has to be used to produce the graph of interest, and to share the produced graphs; (b) a mind map workbench for managing this typology of diagrams; (c) a network visualisation for creating visualisations aiming at highlighting the connections among the entities of a connected graph; (d) a catalogue-based publishing platform to disseminate artefacts according to the FAIR principles [24]. The latter platform [13] makes it possible to customise, per domain, the typologies of items to be published by carefully defining their metadata (attributes, possible values, constraints) and some management triggers (e.g. what values should be transformed into tags, what should lead to groups). Moreover, catalogue items are expected to be endowed with “resources” representing the payload of any item. Therefore, by using catalogue item resources it is possible, for example, to execute a model, to access a dataset, or to visualize a graph; (e) a research community dashboard realising a domain-specific access point to search for content of interest. This is based on the OpenAIRE specific service [25], which makes it possible to publish research products and interlink them with the OpenAIRE scholarly communication cloud; (f) a scholarly publishing platform integrated with Pensoft infrastructure [26] to enable the creation of innovative papers including datasets





and methods hosted by the AGINFRA PLUS platform. By relying on this platform, users are allowed to mix the narrative of a traditional paper with links aiming at giving effective access to the digital version of the research products.

III. EXPLOITATION SCENARIOS

A. Agroclimatic Modeling

The objective is to set up and evaluate an AGINFRA PLUS VRE for use by agro-climatic researchers to perform crop modelling related work. To guide the selection and development of tools to be included in the VRE, two typical research activities were selected: (i) performing crop model simulations at scale; and (ii) explorative modelling focussing on crop phenology studies. Based on data availability both have a strong focus on The Netherlands as study area (using AgroDataCube [8] as input), but the approaches can be extended to other regions once sufficient data is collected and a suitable crop model has been added to the VRE.

Initially the well-known WOFOST model [27] has been integrated so that it can be executed as tasks in the AGINFRA PLUS data analytics platform (see Sec. II-B). DataMiner makes algorithms available as Web Processing Service (WPS, a standard from the Open Geospatial Consortium, OGC). Using these facilities a ‘worker’ process has been implemented that can run large batches (1,000 to 10,000) of crop simulations on the distributed computing infrastructure behind the platform, together with a ‘scheduler’ process that can divide the total workload over all available compute nodes (currently between 6 and 10) and collect all simulation results. The ‘workload’ for example might consist of running the crop simulations for all crop parcels of one or more years in The Netherlands, about 400,000 crop simulations per year, studying effects of input parameter variations, such as temperature sums or precipitation amounts, which multiplies the total number of crop simulations needing to be performed.

A second activity that is examined is the use of explorative modelling for the estimation of crop phenology characteristics, using available agronomic data combined with crop development indicators (e.g. the NDVI vegetation index) derived from remote sensing data. This activity uses the AGINFRA PLUS analytics facilities, such as Jupyter Notebooks and RStudio, to experiment with agronomic data analytics. The aim is to test such analytics, providing insight into critical crop development indicators, and to convert these into algorithms deployed as DataMiner processes on the VRE to run them at scale. Results can then be used to more accurately estimate regional crop yields, using long-term agronomic statistics and yield prediction systems.

Early stage evaluation results from piloting both agro-climatic modelling activities indicated that the VRE was already regarded as being well equipped for collaborative research. At that point (about one year ago) there were however reservations concerning the ability of the VRE to support full agro-climatic modelling workflows, due to some limitations regarding the integration of the different processing, analytics and visualisation components available in the VRE. Therefore recent efforts in the use case have been focussing on improving these integration capabilities and on providing better-connected prototypes for both activities, supporting the full research working process, for example by creating a dashboard that visualises crop parcels, the input data for crop model simulations (crop, soil and weather information), and on-the-fly calculated simulation results such as leaf area index and total biomass produced.

The Virtual Research Environment developed for supporting this scenario is available at https://aginfra.d4science.org/web/agroclimaticmodelling.

B. Food Safety Risk Assessment

In the domain of food safety modelling two exploitation scenarios were identified where scientific data analysis workflows and software-based resources for knowledge sharing and integration are of extraordinary importance. Both scenarios complement the activities the community is promoting to harmonise the knowledge produced [28].

The DEMETER scenario aims at developing a working environment supporting the early identification of issues in the food (and feed) chain. This scenario largely builds upon the workspace, the data analytics and the catalogue to demonstrate how KNIME-based data mining workflows can be efficiently shared and applied from within the VRE.

The RAKIP scenario aims at providing risk assessors and risk modellers with an environment supporting their efforts to share their knowledge (data, mathematical models, simulation results) in a harmonized way. A distinguishing feature of this environment is a community-driven food safety model repository that contains mathematical models from the area of predictive microbial modelling and quantitative microbial risk assessment (QMRA). This repository builds upon the FSK-Lab [29], i.e. a community standard to homogenise the representation and packaging of all relevant data, metadata and model scripts in a machine-readable format. This is an extension of the KNIME platform, one of the platforms underlying the data analytics.

The AGINFRA PLUS platform supports these scenarios by providing: (a) facilities for developing the ontology underlying the FSK-Lab solution for model representation (VocBench, see Sec. II-A); (b) two specific processes integrated into the data analytics platform to respectively support the publishing of a model into the catalogue and the execution of any model; (c) a catalogue where the models are published according to the community ontology, each endowed with three actionable resources enabling users to respectively download the model, perform a model simulation by using the default parameters, and perform a model simulation by tuning the parameters; (d) a mind map development and dissemination solution facilitating the communication among the members; (e) a journal-based approach for publishing the models. The Food Modeling Journal (https://fmj.pensoft.net/) has been designed and launched to support the needs emerging in this community. It promotes






the publishing of Models, Data analytics, Applied study, Data paper, and Software description. Thanks to the integration of the publishing platform into the VREs (see Sec. II-C) it is straightforward to produce papers linking the available artifacts, e.g. the models in their actionable form.

The Virtual Research Environments developed for supporting these scenarios are available at https://aginfra.d4science.org/web/demeter and https://aginfra.d4science.org/web/rakip portal.

C. Food Security

The Food Security Community is focusing on a high-throughput plant phenotyping scenario. This scenario can help to select crop varieties that better adapt to global changes in order to respond to the food security challenges. High-throughput phenotyping produces a large amount of data which need to be integrated and analysed right away. For example, in a greenhouse platform, a lot of images of plants are taken: 13 images per plant per day are taken on the Montpellier platform, which works on 1,600 plants (more than 20,000 images per day). Field platforms produce and need a lot of images, including UAV or satellite imagery. High-throughput phenotyping platforms produce complex data (sensor data, human readings) at different scales (e.g. population, individuals, molecular).

The phenomics community needs tools to easily access large datasets and to be able to visualize and analyse them. Moreover, sharing data, analytics processes and results is essential. The objective of this use case is to develop a VRE for phenomics researchers where these users: have access to relevant ontologies; collaborate on building and sharing semantic resources; have access to phenomics platforms data from the information system OpenSILEX-PHIS [30]; visualize data; import and run data analytics scripts in different languages (R, Python, etc.); import or update and run data analytics workflows (KNIME, Galaxy); share results and work with other users.

The first evaluation results of the Food Security VRE indicated that the VRE is useful for collaborative work. The diversity of the available tools has also attracted interest from the users. However, there were some reservations about the integration of these tools, which made the execution of certain data analysis workflows difficult. A concern about big data manipulation and data access was also noted. Considering this, recent work has been carried out to improve these integration capabilities in order to provide better-connected tools. Web Services based on the Breeding API standards (https://brapi.org) have also been implemented into the OpenSILEX-PHIS system in order to easily access phenotyping data in the VRE.

The Virtual Research Environment developed for supporting this scenario is available at https://aginfra.d4science.org/web/foodsecurity.

IV. CONCLUSION

This paper presented the AGINFRA PLUS platform, a science gateway providing the Agri-food community with a rich array of services oriented to promote the implementation of open science practices. The platform is currently supporting three designated communities dealing with crop simulation, food safety risk assessment, and high-throughput plant phenotyping scenarios.

The platform is bringing into these communities and their working practices a number of benefits including (a) the simplicity for coworkers to perform collaborative work, e.g. the workspace is a working area users can count on to collaborate, and the social networking is a means to have informed dialogues; (b) the ease of sharing results of any form within and across the boundaries of their communities and the platform itself, e.g. the catalogue is a valuable service for disseminating research artefacts and enabling users to access them, and the integration with the OpenAIRE dashboard and the scholarly communication platform reduces the gaps with the scholarly communication domain; (c) the attention dedicated to easing the flow of existing artefacts into the platform, thus reducing fragmentation and facilitating their reuse, e.g. the plethora of programming languages and approaches supported by the analytics facilities makes it possible to easily integrate almost any existing analytics method, and the array of solutions for ontology management facilitates their reuse.

Overall, the AGINFRA PLUS platform is currently serving hundreds of users (more than 340 in Feb. 2019) through 13 active VREs. In the coming months these figures are expected to increase because the project will enter the community validation and uptake phase. In the period Mar. 2018 - Feb. 2019 the users served by this platform and its VREs performed: a total of 24,439 working sessions, with an average of circa 2,036 sessions per month; a total of 1,959 social interactions, with an average of circa 163 interactions per month; a total of 1,842 analytics tasks, with an average of circa 153 tasks per month. In addition, a total of 387 items have been published into the catalogue, including models, research objects, methods, services, terms, and datasets.

Future developments include the development of a catalogue supporting semantic queries, the development of tools easing the discovery of and access to geospatial datasets, the development of recommender systems, and the development of tools supporting the identification of suitable licences for the produced artifacts.

ACKNOWLEDGMENT

This work has received funding from the European Union’s Horizon 2020 research and innovation programme under AGINFRA PLUS project (grant agreement No. 731001).

REFERENCES

 [1] J. W. Jones, J. M. Antle, B. Basso, K. J. Boote, R. T. Conant, I. Foster, H. C. J. Godfray, M. Herrero, R. E. Howitt, S. Janssen, B. A. Keating, R. Munoz-Carpena, C. H. Porter, C. Rosenzweig, and T. R. Wheeler, “Toward a new generation of agricultural system data, models, and knowledge products: State of agricultural systems






     science,” Agricultural Systems, vol. 155, pp. 269–288, 2017. [Online].         [22] F. Perez and B. E. Granger, “IPython: A system for interactive scientific
     Available: https://doi.org/10.1016/j.agsy.2016.09.021                               computing,” Computing in Science & Engineering, vol. 9, no. 3, pp.
 [2] e-ROSA Consortium, “A roadmap for a pan-european e-infrastructure                   21–29, 2007.
     for open science in agricultural and food sciences,” e-ROSA Roadmap,           [23] J. Goecks, A. Nekrutenko, and J. Taylor, “Galaxy: a comprehensive
     2018.                                                                               approach for supporting accessible, reproducible, and transparent com-
 [3] L. Candela, D. Castelli, and P. Pagano, “Virtual research environments:             putational research in the life sciences,” Genome Biology, vol. 11, no. 8,
     an overview and a research agenda,” Data Science Journal, vol. 12, pp.              p. R86, Aug 2010.
     GRDI75–GRDI81, 2013.                                                           [24] M. D. Wilkinson, M. Dumontier, I. J. Aalbersberg, G. Appleton,
 [4] M. W. Maier, “Architecting principles for systems-of-systems,” INCOSE               M. Axton, A. Baak, N. Blomberg, J.-W. Boiten, L. B. da Silva Santos,
     International Symposium, vol. 6, no. 1, pp. 565–573, 1996. [Online].                P. E. Bourne, J. Bouwman, A. J. Brookes, T. Clark, M. Crosas, I. Dillo,
     Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/j.2334-5837.             O. Dumon, S. Edmunds, C. T. Evelo, R. Finkers, A. Gonzalez-Beltran,
     1996.tb02054.x                                                                      A. J. G. Gray, P. Groth, C. Goble, J. S. Grethe, J. Heringa, P. A. C.
                                                                                         ’t Hoen, R. Hooft, T. Kuhn, R. Kok, J. Kok, S. J. Lusher, M. E.
 [5] D4Science Consortium, “D4Science: an e-infrastructure supporting vir-
                                                                                         Martone, A. Mons, A. L. Packer, B. Persson, P. Rocca-Serra, M. Roos,
     tual research environments,” www.d4science.org.
                                                                                         R. van Schaik, S.-A. Sansone, E. Schultes, T. Sengstag, T. Slater,
 [6] EGI Foundation, “EGI e-infrastructure,” www.egi.eu.                                 G. Strawn, M. A. Swertz, M. Thompson, J. van der Lei, E. van
 [7] OpenAIRE Consortium, “OpenAIRE: the european scholarly communi-                     Mulligen, J. Velterop, A. Waagmeester, P. Wittenburg, K. Wolstencroft,
     cation data infrastructure,” www.openaire.eu.                                       J. Zhao, and B. Mons, “The FAIR guiding principles for scientific data
 [8] H. Janssen, S. Janssen, M. Knapen, W. Meijninger, Y. v. Randen, I. l.               management and stewardship,” Scientific Data, vol. 3, p. 160018 EP,
     Riviere, and G. Roerink, “AgroDataCube: A big open data collection                  2016. [Online]. Available: http://dx.doi.org/10.1038/sdata.2016.18
     for agri-food applications,” agrodatacube.wur.nl, 2018.                        [25] P. Prı́ncipe, A. Bardi, A. Vieira, P. Manghi, M. Baglioni, and N. Retberg,
 [9] C. Caracciolo, A. Stellato, A. Morshed, G. Johannsen, S. Rajbhandari,               “Openaire dashboard for research communities: Enabling open science
     Y. Jaques, and J. Keizer, “The AGROVOC linked dataset,” Semantic                    publishing for research communities and research infrastructures,” Poster
     Web, vol. 4, no. 3, pp. 341–348, 2013.                                              presented at the Open Science Conference 2019, Berlin, Germany, 19-20
[10] German Federal Institute for Risk Assessment, “Foodrisk-labs,”                      March 2019, 2019.
     https://foodrisklabs.bfr.bund.de/foodrisk-labs/.                               [26] L. Penev, “From open access to open science from the viewpoint of a
[11] M. Assante, L. Candela, D. Castelli, R. Cirilllo, G. Coro, L. Frosini,              scholarly publisher,” Research Ideas and Outcomes, vol. 3, p. e12265,
     L. Lelii, F. Mangiacrapa, V. Marioli, P. Pagano, G. Panichi,                        2017. [Online]. Available: https://doi.org/10.3897/rio.3.e12265
     C. Perciante, and F. Sinibaldi, “The gcube system: Delivering virtual          [27] A. de Wit, H. Boogaard, D. Fumagalli, S. Janssen, R. Knapen,
     research environments as-a-service,” Future Generation Computer                     D. van Kraalingen, I. Supit, R. van der Wijngaart, and K. van
     Systems, vol. 95, no. n.a., pp. 445–453, 2019. [Online]. Available:                 Diepen, “25 years of the wofost cropping systems model,” Agricultural
     http://www.sciencedirect.com/science/article/pii/S0167739X17328364                  Systems, vol. 168, pp. 154 – 167, 2019. [Online]. Available:
[12] AGINFRA Consortium, “The AGINFRA gateway,” https://aginfra.                         http://www.sciencedirect.com/science/article/pii/S0308521X17310107
     d4science.org/.                                                                [28] L. U. Haberbeck, C. Plaza-Rodrı́guez, V. Desvignes, P. Dalgaard,
[13] M. Assante, L. Candela, D. Castelli, G. Coro, F. Mangiacrapa, P. Pagano,            M. Sanaa, L. Guillier, M. Nauta, and M. Filter, “Harmonized terms,
     and P. Costantino, “Enacting open science by gcube,” in Proceedings of              concepts and metadata for microbiological risk assessment models: The
     the 9th International Workshop on Science Gateways, 2018.                           basis for knowledge integration and exchange,” Microbial Risk Analysis,
                                                                                         vol. 10, pp. 3 – 12, 2018, special issue on 10th International Conference
[14] A. Stellato, S. Rajbhandari, A. Turbati, M. Fiorelli, C. Caracciolo,
                                                                                         on Predictive Modelling in Food: Interdisciplinary Approaches and
     T. Lorenzetti, J. Keizer, and M. T. Pazienza, “Vocbench: A web
                                                                                         Decision-Making Tools in Microbial Risk Analysis. [Online]. Available:
     application for collaborative development of multilingual thesauri,” in
                                                                                         http://www.sciencedirect.com/science/article/pii/S2352352218300100
     The Semantic Web. Latest Advances and New Domains, F. Gandon,
                                                                                    [29] M. de Alba Aparicio, T. Buschhardt, A. Swaid, L. Valentin, O. Mesa-
     M. Sabou, H. Sack, C. d’Amato, P. Cudré-Mauroux, and A. Zimmer-
                                                                                         Varona, T. Günther, C. Plaza-Rodriguez, and M. Filter, “Fsk-lab – an
     mann, Eds. Cham: Springer International Publishing, 2015, pp. 38–53.
                                                                                         open source food safety model integration tool,” Microbial Risk Analysis,
[15] J. Volz, C. Bizer, M. Gaedke, and G. Kobilarov, “Silk – a link                      vol. 10, pp. 13 – 19, 2018, special issue on 10th International Conference
     discovery framework for the web of data,” in Proceedings of the                     on Predictive Modelling in Food: Interdisciplinary Approaches and
     Linked Data on the Web Workshop (LDOW2009), Madrid, Spain, April                    Decision-Making Tools in Microbial Risk Analysis. [Online]. Available:
     20, 2009, CEUR Workshop Proceedings, 2009. [Online]. Available:                     http://www.sciencedirect.com/science/article/pii/S2352352218300136
     http://ceur-ws.org/Vol-538/ldow2009 paper13.pdf                                [30] P. Neveu, A. Tireau, N. Hilgert, V. Nègre, J. Mineau-Cesari,
[16] R. Verborgh and M. De Wilde, Using OpenRefine. Packt Publishing,                    N. Brichet, R. Chapuis, I. Sanchez, C. Pommier, B. Charnomordic,
     2013.                                                                               F. Tardieu, and L. Cabrera-Bosquet, “Dealing with multi-source
[17] S. Lohmann, V. Link, E. Marbach, and S. Negru, “WebVOWL: Web-                       and multi-scale information in plant phenomics: the ontology-
     based visualization of ontologies,” in Knowledge Engineering and                    driven phenotyping hybrid information system,” New Phytologist,
     Knowledge Management, P. Lambrix, E. Hyvönen, E. Blomqvist, V. Pre-                vol. 221, no. 1, pp. 588–601, 2019. [Online]. Available: https:
     sutti, G. Qi, U. Sattler, Y. Ding, and C. Ghidini, Eds. Cham: Springer              //nph.onlinelibrary.wiley.com/doi/abs/10.1111/nph.15385
     International Publishing, 2015, pp. 154–158.
[18] D. Ngo and Z. Bellahsene, “Overview of yam++—(not) yet
     another matcher for ontology alignment task,” Journal of Web
     Semantics, vol. 41, pp. 30 – 49, 2016. [Online]. Available:
     http://www.sciencedirect.com/science/article/pii/S1570826816300464
[19] S. Khalifa, Y. Elshater, K. Sundaravarathan, A. Bhat, P. Martin,
     F. Imam, D. Rope, M. Mcroberts, and C. Statchuk, “The six pillars
     for building big data analytics ecosystems,” ACM Comput. Surv.,
     vol. 49, no. 2, pp. 33:1–33:36, Aug. 2016. [Online]. Available:
     http://doi.acm.org/10.1145/2963143
[20] G. Coro, G. Panichi, P. Scarponi, and P. Pagano, “Cloud computing in a
     distributed e-infrastructure using the web processing service standard,”
     Concurrency and Computation: Practice and Experience, vol. 29, no. 18,
     p. e4219, 2017.
[21] M. R. Berthold, N. Cebron, F. Dill, T. R. Gabriel, T. Kötter,
     T. Meinl, P. Ohl, K. Thiel, and B. Wiswedel, “Knime - the
     konstanz information miner: Version 2.0 and beyond,” SIGKDD Explor.
     Newsl., vol. 11, no. 1, pp. 26–31, Nov. 2009. [Online]. Available:
     http://doi.acm.org/10.1145/1656274.1656280




                                                                                6