11th International Workshop on Science Gateways (IWSG 2019), 12-14 June 2019

Realising a Science Gateway for the Agri-food: the AGINFRA PLUS Experience

M. Assante∗, A. Boizet†, L. Candela∗, D. Castelli∗, R. Cirillo∗, G. Coro∗, E. Fernández§, M. Filter¶, L. Frosini∗, G. Kakaletris‡, P. Katsivelis‖, M.J.R. Knapen∗∗, L. Lelii∗, R.M. Lokers∗∗, F. Mangiacrapa∗, P. Pagano∗, G. Panichi∗, L. Penev††, F. Sinibaldi∗, P. Zervas‖

∗ ISTI - National Research Council of Italy, Pisa, Italy – Email: {name.surname}@isti.cnr.it
† French National Institute for Agricultural Research, Paris, France – Email: alice.boizet@inra.fr
§ EGI Foundation, Amsterdam, The Netherlands – Email: enol.fernandez@egi.eu
¶ Federal Institute for Risk Assessment (BfR), Berlin, Germany – Email: matthias.filter@bfr.bund.de
‡ University of Athens, Athens, Greece – Email: gkakas@di.uoa.gr
‖ Agroknow, Athens, Greece – Email: {katsivelis.panagis, pzervas}@agroknow.com
∗∗ Wageningen University & Research, Wageningen, The Netherlands – Email: {rob.knapen, rob.lokers}@wur.nl
†† Pensoft Publishers, Sofia, Bulgaria – Email: penev@pensoft.net

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract: Advances in IT solutions and the open science movement are changing the practices of data collection, collation, processing, analytics, and publishing in all domains, including agri-food. In implementing these changes, however, one of the major issues faced by agri-food researchers is the fragmentation of the “assets” to be exploited when performing research tasks: data of interest are heterogeneous and scattered across several repositories; the tools modellers rely on are diverse and often make use of limited computing capacity; publishing practices vary and rarely aim at making available the “whole story” with datasets, processes, and workflows. This paper presents the AGINFRA PLUS endeavour to overcome these limitations by providing researchers in three designated communities with Virtual Research Environments that facilitate the use of the “assets” of interest and promote collaboration.

Keywords: Virtual research environment; Agroclimatic modeling; Food safety risks assessment; Food security

I. INTRODUCTION

The developments in information and communication technologies, including big data availability and management, web and cloud technologies, as well as open science related practices, are not yet fully embraced by the Agriculture and Food Science research domain [1], [2]. The fragmentation of “resources” of interest across several heterogeneous “places” is certainly one of the major factors hindering this uptake: data are heterogeneous and scattered across several repositories, modelling tools and supporting systems are diverse, and the amount of available computing capacity varies a lot across teams and laboratories.

The AGINFRA PLUS project has been set up to develop an innovative approach to Agri-food digital science practices that overcomes the limitations stemming from the above settings by leveraging existing e-Infrastructures and services. In particular, AGINFRA PLUS promotes the exploitation of Virtual Research Environments (VREs) [3] to provide designated communities with seamless access to the data, services, and facilities they need to perform their research tasks in a collaborative way. Such VREs are built by relying on an open and distributed platform (see Sec. II) providing a rich array of services supporting all the phases of an open science research lifecycle, from data collection to data analytics and publication.

AGINFRA PLUS is exploiting the VRE approach for three prominent agri-food research communities, namely: (i) agro-climatic and economic modelling, focusing on use cases related to crop modelling and crop phenology estimation; (ii) food safety risk assessment, focusing on use cases to support scientists in the multidisciplinary field of risk assessment and emerging risk identification; and (iii) food security, focusing on use cases related to high-throughput phenotyping to support phenomics researchers in selecting the most suitable plant species and varieties for specific environments.

The remainder of the paper is organised as follows. Sec. II presents the major constituents of the AGINFRA PLUS platform. Sec. III discusses the exploitation scenarios developed by each community and the benefits resulting from the use of the platform. Finally, Sec. IV concludes the paper by reporting some future work.

II. THE AGINFRA PLUS PLATFORM

In order to support the AGINFRA PLUS communities, a comprehensive and feature-rich platform has been developed and operated. An overall picture of this platform, which offers its facilities through the as-a-Service delivery model, is given in Fig. 1.

Fig. 1. AGINFRA PLUS Platform Architecture

The platform follows the system of systems approach [4], where the constituent systems offer “resources” (namely services) for the implementation of the resulting system facilities. In particular, the platform aggregates “resources” from “domain agnostic” service providers (e.g. D4Science [5], EGI [6], OpenAIRE [7]) as well as from community-specific ones (e.g. AgroDataCube [8], AGROVOC [9], RAKIP model repository [10]) to build a unifying space where the aggregated resources can be exploited via VREs [11]. This system of systems approach is enabled by D4Science.
D4Science is at the heart of the overall platform. In fact, this service provider offers the core services to implement the resulting platform, namely:
(a) the AGINFRA PLUS gateway [12], realising the single access point to the rest of the platform (see Fig. 2);
(b) the authentication and authorisation infrastructure, enabling users to seamlessly access the aggregated services once they have logged in to the gateway;
(c) the shared workspace, for storing, organising and sharing any version of a research artefact [13], including datasets and model implementations;
(d) the social networking area, enabling collaborative and open discussions on any topic and disseminating information of interest for the community, e.g. the availability of a research outcome [13];
(e) the overall catalogue, recording the assets worth publishing so that others can be informed of these assets and make use of them [13].

Fig. 2. AGINFRA PLUS Gateway: the Dashboard

These basic facilities are complemented by services for the semantic-oriented management of data, data analytics, data visualization, and publishing.

A. Semantic Data Management Solutions

The AGINFRA PLUS data & semantics facilities offer an array of services for managing semantic resources (e.g. ontologies, thesauri, vocabularies) and for benefitting from such resources in tasks related to data management. The supported facilities include:
(a) an ontology engineering service for creating, editing and managing semantic resources while catering for their collaborative design, editing and management. It is based on VocBench [14], a web-based platform for managing OWL ontologies, SKOS thesauri and RDF datasets;
(b) a semantic linking service supporting the establishment of semantic links between data items belonging to different datasets and different sources. It is based on Silk [15], a web-based platform enabling users to manage diverse data sources, linking tasks and transformation tasks;
(c) a data transformation service promoting the RDF-isation of tabular data, i.e. a user can determine the rules for transforming the data into triples using arbitrary schemas and ontologies. In practice, it supports the building of an RDF skeleton defining how cell values will be translated into RDF. It is based on the open-source OpenRefine tool [16], a powerful tool for data cleaning and transformation including a plug-in for RDF-isation;
(d) an ontology visualisation service supporting users in uploading and/or importing ontologies and visualising the graph corresponding to the ontology. Classes and instances are represented as circular nodes and properties are represented as edges between these nodes. A side panel giving information on each entity as defined in the ontology completes the offering. It is based on WebVOWL [17], a web-based tool for the interactive visualisation of ontologies;
(e) an ontology alignment service helping users establish mappings between two diverse ontologies or thesauri. It is based on YAM++ [18], a web tool proven to be effective and scalable in ontology matching tasks.

B. Data Analytics Solutions

The AGINFRA PLUS analytics facilities offer a rich array of services for the challenging task of big data analytics [19]. The supported facilities include:
(a) a data analytics platform to execute analytics tasks by relying on methods provided either by the user or by others [20]. It is endowed with importing and sharing facilities for analytics methods implemented in heterogeneous forms including R, Java, Python, and KNIME [21] (largely used by the food safety community). The platform enacts task execution on a distributed and hybrid computing infrastructure including EGI resources. Moreover, one noteworthy feature of this platform is its open science-friendliness. All the analytics methods integrated in it are exposed through a standard protocol (the OGC WPS protocol) that clients can use to discover the available methods as well as to start processes, monitor their execution and access results. Every analytics task performed by the platform automatically produces a provenance record catering for the repeatability of the task;
(b) an RStudio-based development environment for R enabling users to perform statistical computing tasks in the cloud. The environment provides its users with a powerful IDE including a console, a source code editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management;
(c) a Jupyter-based notebook environment for documenting and recording analytics processes [22]. Every notebook is a rich document that contains live code, equations, visualizations and narrative text aiming at capturing a research activity;
(d) a Galaxy-based workflow management workbench for combining several analytics tasks into workflows [23]. In practice, it offers a means to build multi-step computational analyses by specifying what data to operate on, what steps to take, and what order to do these steps in.

All these platforms and environments are well integrated with each other as well as with the rest of the services offered by AGINFRA PLUS. For instance, every method integrated in the data analytics platform can be easily executed from a Jupyter notebook or a Galaxy workflow. All these tools are equipped with solutions facilitating access to the workspace content so that it can be used during the processing steps, e.g. to use files as inputs or to store results. It is straightforward to publish every analytics process implemented by these tools into the catalogue to share it with coworkers.

C. Data Visualization and Publishing Solutions

The AGINFRA PLUS data visualization & publishing facilities provide users with feature-rich and flexible solutions for developing representations (e.g. graphs) out of datasets and publishing “research objects” documenting a research activity and its results. The final goal is to provide the reader with an effective representation of a research activity and its results, thus enabling its repeatability. The supported facilities include:
(a) a graphs management workbench for creating several typologies of interactive graphs ranging from generic ones (e.g. Spline, Scatter, Bar, Line, Step, Pie, Doughnut, Polar) to very specific ones (e.g. graphs reporting the height of plants across time with values and images). The platform provides users with facilities to import a dataset of interest, to define how its content has to be used to produce the graph of interest, and to share the produced graphs;
(b) a mind map workbench for managing this typology of diagrams;
(c) a network visualisation workbench for creating visualisations that highlight the connections among the entities of a connected graph;
(d) a catalogue-based publishing platform to disseminate artefacts according to the FAIR principles [24]. This platform [13] makes it possible to customise, per domain, the typologies of items to be published by carefully defining their metadata (attributes, possible values, constraints) and some management triggers (e.g. what values should be transformed into tags, what should lead to groups). Moreover, catalogue items are expected to be endowed with “resources” representing the payload of any item. Therefore, by using catalogue item resources it is possible, for example, to execute a model, to access a dataset, or to visualize a graph;
(e) a research community dashboard realising a domain-specific access point to search for content of interest. This is based on the OpenAIRE specific service [25], which makes it possible to publish research products and interlink them with the OpenAIRE scholarly communication cloud;
(f) a scholarly publishing platform integrated with the Pensoft infrastructure [26] to enable the creation of innovative papers including datasets and methods hosted by the AGINFRA PLUS platform. By relying on this platform, users are allowed to mix the narrative of a traditional paper with links aiming at giving effective access to the digital version of the research products.

III. EXPLOITATION SCENARIOS

A. Agroclimatic Modeling

The objective is to set up and evaluate an AGINFRA PLUS VRE for use by agro-climatic researchers to perform crop modelling related work. To guide the selection and development of tools to be included in the VRE, two typical research activities were selected: (i) performing crop model simulations at scale; and (ii) explorative modelling focussing on crop phenology studies. Based on data availability, both have a strong focus on The Netherlands as study area (using AgroDataCube [8] as input), but the approaches can be extended to other regions once sufficient data is collected and a suitable crop model has been added to the VRE.

Initially the well-known WOFOST model [27] has been integrated so that it can be executed as tasks in the AGINFRA PLUS data analytics platform (see Sec. II-B). DataMiner makes algorithms available as Web Processing Service (WPS, a standard from the Open Geospatial Consortium, OGC) processes. Using these facilities, a ‘worker’ process has been implemented that can run large batches (1,000 to 10,000) of crop simulations on the distributed computing infrastructure behind the platform, together with a ‘scheduler’ process that can divide the total workload over all available compute nodes (currently between 6 and 10) and collect all simulation results. The ‘workload’ might for example consist of running the crop simulations for all crop parcels of one or more years in The Netherlands (about 400,000 crop simulations per year) while studying the effects of input parameter variations, such as temperature sums or precipitation amounts, which multiplies the total number of crop simulations that need to be performed.

A second activity that is examined is the use of explorative modelling for the estimation of crop phenology characteristics, using available agronomic data combined with crop development indicators (e.g. the NDVI vegetation index) derived from remote sensing data. This activity uses the AGINFRA PLUS analytics facilities, such as Jupyter Notebooks and RStudio, to experiment with agronomic data analytics. The aim is to test such analytics, providing insight into critical crop development indicators, and to convert these into algorithms deployed as DataMiner processes on the VRE to run them at scale. Results can then be used to more accurately estimate regional crop yields, using long-term agronomic statistics and yield prediction systems.

Early stage evaluation results from piloting both agro-climatic modelling activities indicated that the VRE is already regarded as being well equipped for collaborative research. At that point (about one year ago) there were however reservations concerning the ability of the VRE to support full agro-climatic modelling workflows, due to some limitations regarding the integration of the different processing, analytics and visualisation components available in the VRE. Therefore recent efforts in the use case have focussed on improving these integration capabilities and on providing better-connected prototypes for both activities, supporting the full research working process; for example, by creating a dashboard that visualises crop parcels, the input data for crop model simulations (crop, soil and weather information), and on-the-fly calculated simulation results such as leaf area index and total biomass produced.

The Virtual Research Environment developed for supporting this scenario is available at https://aginfra.d4science.org/web/agroclimaticmodelling.

B. Food Safety Risk Assessment

In the domain of food safety modelling, two exploitation scenarios were identified where scientific data analysis workflows and software-based resources for knowledge sharing and integration are of extraordinary importance. Both scenarios nicely complement the activities the community is promoting to harmonise the knowledge produced [28].

The DEMETER scenario aims at developing a working environment supporting the early identification of issues in the food (and feed) chain. This scenario largely builds upon the workspace, the data analytics and the catalogue to demonstrate how KNIME-based data mining workflows can be efficiently shared and applied from within the VRE.

The RAKIP scenario aims at providing risk assessors and risk modellers with an environment supporting their efforts to share their knowledge (data, mathematical models, simulation results) in a harmonized way. A distinguishing feature of this environment is a community-driven food safety model repository that contains mathematical models from the areas of predictive microbial modelling and quantitative microbial risk assessment (QMRA). This repository builds upon FSK-Lab [29], i.e. a community standard to homogenise the representation and packaging of all relevant data, metadata and model scripts in a machine-readable format. This is an extension of the KNIME platform, one of the platforms underlying the data analytics.

The AGINFRA PLUS platform supports these scenarios by providing: (a) facilities for developing the ontology underlying the FSK-Lab solution for model representation (VocBench, see Sec. II-A); (b) two specific processes integrated into the data analytics platform to support, respectively, the publishing of a model into the catalogue and the execution of any model; (c) a catalogue where the models are published according to the community ontology, each endowed with three actionable resources enabling users, respectively, to download the model, to perform a model simulation using the default parameters, and to perform a model simulation by tuning the parameters; (d) a mind map development and dissemination solution facilitating the communication among the members; (e) a journal-based approach for publishing the models. The Food Modeling Journal (https://fmj.pensoft.net/) has been designed and launched to support the needs emerging in this community. It promotes the publishing of Models, Data analytics, Applied study, Data paper, and Software description. Thanks to the integration of the publishing platform into the VREs (see Sec. II-C) it is straightforward to produce papers linking the available artifacts, e.g. the models in their actionable form.
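
The data analytics platform exposes its methods through the OGC WPS protocol (see Sec. II-B), so the model-execution process mentioned in item (b) could in principle be triggered by any WPS client. The following is a minimal sketch of how such request URLs might be assembled, assuming WPS 1.0.0 key-value-pair encoding. The endpoint URL, the process identifier `run_model`, and the input names are hypothetical placeholders, not the actual AGINFRA PLUS ones.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: a placeholder for illustration, not the real service URL.
WPS_ENDPOINT = "https://example.org/wps/WebProcessingService"


def wps_url(request, identifier=None, inputs=None, endpoint=WPS_ENDPOINT):
    """Build a WPS 1.0.0 key-value-pair (KVP) request URL.

    request    -- "GetCapabilities", "DescribeProcess" or "Execute"
    identifier -- process identifier (needed by DescribeProcess and Execute)
    inputs     -- dict of literal input values for Execute
    """
    params = {"service": "WPS", "request": request}
    # In WPS 1.0.0 the 'version' parameter applies to DescribeProcess and
    # Execute; GetCapabilities negotiates the version separately.
    if request != "GetCapabilities":
        params["version"] = "1.0.0"
    if identifier is not None:
        params["identifier"] = identifier
    if inputs:
        # KVP Execute encodes literal inputs as 'name=value' pairs joined by ';'.
        params["datainputs"] = ";".join(
            f"{name}={value}" for name, value in inputs.items()
        )
    # urlencode percent-encodes the '=' and ';' inside the datainputs value.
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Discover the available methods.
    print(wps_url("GetCapabilities"))
    # Inspect one (hypothetical) process.
    print(wps_url("DescribeProcess", "run_model"))
    # Start a (hypothetical) model simulation with tuned parameters.
    print(wps_url("Execute", "run_model", {"model_id": "m1", "temperature": 20}))
```

Under these assumptions, the three calls correspond to the three client activities named in Sec. II-B: discovering methods, inspecting a process, and starting an execution. The sketch leaves out authentication, which a real deployment behind the gateway would presumably require.
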

The Virtual Research Environments developed for supporting these scenarios are available at https://aginfra.d4science.org/web/demeter and https://aginfra.d4science.org/web/rakip_portal.

C. Food Security

The Food Security community is focusing on a high-throughput plant phenotyping scenario. This scenario can help to select crop varieties that better adapt to global changes in order to respond to the food security challenges. High-throughput phenotyping produces a large amount of data that need to be integrated and analysed right away. For example, greenhouse platforms take a large number of plant images: the Montpellier platform, which works on 1600 plants, takes 13 images per plant per day (more than 20,000 images per day). Field platforms also produce and require many images, including UAV and satellite imagery. High-throughput phenotyping platforms produce complex data (sensor data, human readings) at different scales (e.g. population, individual, molecular).

The phenomics community needs tools to easily access large datasets and to visualize and analyse them. Moreover, sharing data, analytics processes and results is essential. The objective of this use case is to develop a VRE for phenomics researchers where these users: have access to relevant ontologies; collaborate on building and sharing semantic resources; have access to phenomics platform data from the information system OpenSILEX-PHIS [30]; visualize data; import and run data analytics scripts in different languages (R, Python, etc.); import or update and run data analytics workflows (KNIME, Galaxy); share results and work with other users.

The first evaluation results of the Food Security VRE indicated that the VRE is useful for collaborative work. The diversity of the available tools also attracted interest from the users. However, there were some reservations about the integration of these tools, which made the execution of certain data analysis workflows difficult. Concerns about big data manipulation and data access were also noted. In response, recent work has improved these integration capabilities in order to provide better-connected tools. Web services based on the Breeding API standards (https://brapi.org) have also been implemented in the OpenSILEX-PHIS system in order to easily access phenotyping data in the VRE.

The Virtual Research Environment developed for supporting this scenario is available at https://aginfra.d4science.org/web/foodsecurity.

IV. CONCLUSION

This paper presented the AGINFRA PLUS platform, a science gateway providing the Agri-food community with a rich array of services oriented to promote the implementation of open science practices. The platform is currently supporting three designated communities dealing with crop simulation, food safety risk assessment, and high-throughput plant phenotyping scenarios.

The platform is bringing a number of benefits to these communities and their working practices, including: (a) the simplicity for coworkers to perform collaborative work, e.g. the workspace is a working area users can count on to collaborate, and the social networking is a means to have informed dialogues; (b) the ease of sharing results of any form within and across the boundaries of their communities and the platform itself, e.g. the catalogue is a valuable service for disseminating research artefacts and enabling users to access them, and the integration with the OpenAIRE dashboard and the scholarly communication platform reduces the gaps with the scholarly communication domain; (c) the attention dedicated to easing the flow of existing artefacts into the platform, thus reducing fragmentation and facilitating their reuse, e.g. the plethora of programming languages and approaches supported by the analytics facilities makes it possible to easily integrate almost any existing analytics method, and the array of solutions for ontology management facilitates the reuse of ontologies.

Overall, the AGINFRA PLUS platform is currently serving hundreds of users (more than 340 in Feb. 2019) through 13 active VREs. In the coming months these figures are expected to grow because the project will enter the community validation and uptake phase. In the period Mar. 2018 - Feb. 2019 the users served by this platform and its VREs performed: a total of 24,439 working sessions, with an average of circa 2,036 sessions per month; a total of 1,959 social interactions, with an average of circa 163 interactions per month; a total of 1,842 analytics tasks, with an average of circa 153 tasks per month. In the same period, a total of 387 items were published into the catalogue, including models, research objects, methods, services, terms, and datasets.

Future developments include the development of a catalogue supporting semantic queries, of tools easing the discovery of and access to geospatial datasets, of recommender systems, and of tools supporting the identification of suitable licences for the produced artifacts.

ACKNOWLEDGMENT

This work has received funding from the European Union’s Horizon 2020 research and innovation programme under AGINFRA PLUS project (grant agreement No. 731001).

REFERENCES

[1] J. W. Jones, J. M. Antle, B. Basso, K. J. Boote, R. T. Conant, I. Foster, H. C. J. Godfray, M. Herrero, R. E. Howitt, S. Janssen, B. A. Keating, R. Munoz-Carpena, C. H. Porter, C. Rosenzweig, and T. R. Wheeler, “Toward a new generation of agricultural system data, models, and knowledge products: State of agricultural systems science,” Agricultural Systems, vol. 155, pp. 269–288, 2017. [Online]. Available: https://doi.org/10.1016/j.agsy.2016.09.021
[2] e-ROSA Consortium, “A roadmap for a pan-European e-infrastructure for open science in agricultural and food sciences,” e-ROSA Roadmap, 2018.
[3] L. Candela, D. Castelli, and P. Pagano, “Virtual research environments: an overview and a research agenda,” Data Science Journal, vol. 12, pp. GRDI75–GRDI81, 2013.
[4] M. W. Maier, “Architecting principles for systems-of-systems,” INCOSE International Symposium, vol. 6, no. 1, pp. 565–573, 1996. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/j.2334-5837.1996.tb02054.x
[5] D4Science Consortium, “D4Science: an e-infrastructure supporting virtual research environments,” www.d4science.org.
[6] EGI Foundation, “EGI e-infrastructure,” www.egi.eu.
[7] OpenAIRE Consortium, “OpenAIRE: the European scholarly communication data infrastructure,” www.openaire.eu.
[8] H. Janssen, S. Janssen, M. Knapen, W. Meijninger, Y. v. Randen, I. l. Riviere, and G. Roerink, “AgroDataCube: A big open data collection for agri-food applications,” agrodatacube.wur.nl, 2018.
[9] C. Caracciolo, A. Stellato, A. Morshed, G. Johannsen, S. Rajbhandari, Y. Jaques, and J. Keizer, “The AGROVOC linked dataset,” Semantic Web, vol. 4, no. 3, pp. 341–348, 2013.
[10] German Federal Institute for Risk Assessment, “FoodRisk-Labs,” https://foodrisklabs.bfr.bund.de/foodrisk-labs/.
[11] M. Assante, L. Candela, D. Castelli, R. Cirillo, G. Coro, L. Frosini, L. Lelii, F. Mangiacrapa, V. Marioli, P. Pagano, G. Panichi, C. Perciante, and F. Sinibaldi, “The gCube system: Delivering virtual research environments as-a-service,” Future Generation Computer Systems, vol. 95, pp. 445–453, 2019. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0167739X17328364
[12] AGINFRA Consortium, “The AGINFRA gateway,” https://aginfra.d4science.org/.
[13] M. Assante, L. Candela, D. Castelli, G. Coro, F. Mangiacrapa, P. Pagano, and P. Costantino, “Enacting open science by gCube,” in Proceedings of the 9th International Workshop on Science Gateways, 2018.
[14] A. Stellato, S. Rajbhandari, A. Turbati, M. Fiorelli, C. Caracciolo, T. Lorenzetti, J. Keizer, and M. T. Pazienza, “VocBench: A web application for collaborative development of multilingual thesauri,” in The Semantic Web. Latest Advances and New Domains, F. Gandon, M. Sabou, H. Sack, C. d’Amato, P. Cudré-Mauroux, and A. Zimmermann, Eds. Cham: Springer International Publishing, 2015, pp. 38–53.
[15] J. Volz, C. Bizer, M. Gaedke, and G. Kobilarov, “Silk – a link discovery framework for the web of data,” in Proceedings of the Linked Data on the Web Workshop (LDOW2009), Madrid, Spain, April 20, 2009, CEUR Workshop Proceedings, 2009. [Online]. Available: http://ceur-ws.org/Vol-538/ldow2009_paper13.pdf
[16] R. Verborgh and M. De Wilde, Using OpenRefine. Packt Publishing, 2013.
[17] S. Lohmann, V. Link, E. Marbach, and S. Negru, “WebVOWL: Web-based visualization of ontologies,” in Knowledge Engineering and Knowledge Management, P. Lambrix, E. Hyvönen, E. Blomqvist, V. Presutti, G. Qi, U. Sattler, Y. Ding, and C. Ghidini, Eds. Cham: Springer International Publishing, 2015, pp. 154–158.
[18] D. Ngo and Z. Bellahsene, “Overview of YAM++ - (not) yet another matcher for ontology alignment task,” Journal of Web Semantics, vol. 41, pp. 30–49, 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1570826816300464
[19] S. Khalifa, Y. Elshater, K. Sundaravarathan, A. Bhat, P. Martin, F. Imam, D. Rope, M. Mcroberts, and C. Statchuk, “The six pillars for building big data analytics ecosystems,” ACM Comput. Surv., vol. 49, no. 2, pp. 33:1–33:36, Aug. 2016. [Online]. Available: http://doi.acm.org/10.1145/2963143
[20] G. Coro, G. Panichi, P. Scarponi, and P. Pagano, “Cloud computing in a distributed e-infrastructure using the web processing service standard,” Concurrency and Computation: Practice and Experience, vol. 29, no. 18, p. e4219, 2017.
[21] M. R. Berthold, N. Cebron, F. Dill, T. R. Gabriel, T. Kötter, T. Meinl, P. Ohl, K. Thiel, and B. Wiswedel, “KNIME - the Konstanz information miner: Version 2.0 and beyond,” SIGKDD Explor. Newsl., vol. 11, no. 1, pp. 26–31, Nov. 2009. [Online]. Available: http://doi.acm.org/10.1145/1656274.1656280
[22] F. Perez and B. E. Granger, “IPython: A system for interactive scientific computing,” Computing in Science & Engineering, vol. 9, no. 3, pp. 21–29, 2007.
[23] J. Goecks, A. Nekrutenko, and J. Taylor, “Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences,” Genome Biology, vol. 11, no. 8, p. R86, Aug 2010.
[24] M. D. Wilkinson, M. Dumontier, I. J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, J.-W. Boiten, L. B. da Silva Santos, P. E. Bourne, J. Bouwman, A. J. Brookes, T. Clark, M. Crosas, I. Dillo, O. Dumon, S. Edmunds, C. T. Evelo, R. Finkers, A. Gonzalez-Beltran, A. J. G. Gray, P. Groth, C. Goble, J. S. Grethe, J. Heringa, P. A. C. ’t Hoen, R. Hooft, T. Kuhn, R. Kok, J. Kok, S. J. Lusher, M. E. Martone, A. Mons, A. L. Packer, B. Persson, P. Rocca-Serra, M. Roos, R. van Schaik, S.-A. Sansone, E. Schultes, T. Sengstag, T. Slater, G. Strawn, M. A. Swertz, M. Thompson, J. van der Lei, E. van Mulligen, J. Velterop, A. Waagmeester, P. Wittenburg, K. Wolstencroft, J. Zhao, and B. Mons, “The FAIR guiding principles for scientific data management and stewardship,” Scientific Data, vol. 3, p. 160018, 2016. [Online]. Available: http://dx.doi.org/10.1038/sdata.2016.18
[25] P. Príncipe, A. Bardi, A. Vieira, P. Manghi, M. Baglioni, and N. Retberg, “OpenAIRE dashboard for research communities: Enabling open science publishing for research communities and research infrastructures,” Poster presented at the Open Science Conference 2019, Berlin, Germany, 19-20 March 2019, 2019.
[26] L. Penev, “From open access to open science from the viewpoint of a scholarly publisher,” Research Ideas and Outcomes, vol. 3, p. e12265, 2017. [Online]. Available: https://doi.org/10.3897/rio.3.e12265
[27] A. de Wit, H. Boogaard, D. Fumagalli, S. Janssen, R. Knapen, D. van Kraalingen, I. Supit, R. van der Wijngaart, and K. van Diepen, “25 years of the WOFOST cropping systems model,” Agricultural Systems, vol. 168, pp. 154–167, 2019. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0308521X17310107
[28] L. U. Haberbeck, C. Plaza-Rodríguez, V. Desvignes, P. Dalgaard, M. Sanaa, L. Guillier, M. Nauta, and M. Filter, “Harmonized terms, concepts and metadata for microbiological risk assessment models: The basis for knowledge integration and exchange,” Microbial Risk Analysis, vol. 10, pp. 3–12, 2018, special issue on 10th International Conference on Predictive Modelling in Food: Interdisciplinary Approaches and Decision-Making Tools in Microbial Risk Analysis. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S2352352218300100
[29] M. de Alba Aparicio, T. Buschhardt, A. Swaid, L. Valentin, O. Mesa-Varona, T. Günther, C. Plaza-Rodriguez, and M. Filter, “FSK-Lab – an open source food safety model integration tool,” Microbial Risk Analysis, vol. 10, pp. 13–19, 2018, special issue on 10th International Conference on Predictive Modelling in Food: Interdisciplinary Approaches and Decision-Making Tools in Microbial Risk Analysis. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S2352352218300136
[30] P. Neveu, A. Tireau, N. Hilgert, V. Nègre, J. Mineau-Cesari, N. Brichet, R. Chapuis, I. Sanchez, C. Pommier, B. Charnomordic, F. Tardieu, and L. Cabrera-Bosquet, “Dealing with multi-source and multi-scale information in plant phenomics: the ontology-driven phenotyping hybrid information system,” New Phytologist, vol. 221, no. 1, pp. 588–601, 2019. [Online]. Available: https://nph.onlinelibrary.wiley.com/doi/abs/10.1111/nph.15385