=Paper= {{Paper |id=Vol-1871/paper5 |storemode=property |title=A Science Gateway for Biodiversity and Climate Change Research |pdfUrl=https://ceur-ws.org/Vol-1871/paper5.pdf |volume=Vol-1871 |authors=Donatello Elia,Alessandra Nuzzo,Paola Nassisi,Sandro Fiore,Ignacio Blanquer,Francisco V. Brasileiro,Iana A. A. Rufino,Arie C. Seijmonsbergen,Niels S. Anders,Carlos de O. Galvao,John E. de B. L. Cunha,Mariane de Sousa-Baena,Vanderlei P. Canhos,Giovanni Aloisio |dblpUrl=https://dblp.org/rec/conf/iwsg/EliaNNFBBRSAGCS16 }} ==A Science Gateway for Biodiversity and Climate Change Research== https://ceur-ws.org/Vol-1871/paper5.pdf
                        8th International Workshop on Science Gateways (IWSG 2016), 8-10 June 2016



    A Science Gateway for Biodiversity and Climate
                  Change Research
 Donatello Elia∗ , Alessandra Nuzzo∗ , Paola Nassisi∗ , Sandro Fiore∗ , Ignacio Blanquer† , Francisco V. Brasileiro‡ ,
            Iana A. A. Rufino‡ , Arie C. Seijmonsbergen§ , Niels S. Anders§ , Carlos de O. Galvão‡ ,
      John E. de B. L. Cunha‡ , Mariane de Sousa-Baena¶ , Vanderlei P. Canhos¶ and Giovanni Aloisio∗‖
                        ∗ Fondazione Centro Euro-Mediterraneo sui Cambiamenti Climatici, Lecce, Italy
                                       † Universitat Politecnica de Valencia, Valencia, Spain
                            ‡ Universidade Federal de Campina Grande, Campina Grande, PB, Brasil
                                   § IBED, University of Amsterdam, Amsterdam, Netherlands
                            ¶ Centro de Referência em Informação Ambiental, Campinas, SP, Brasil
                                                ‖ University of Salento, Lecce, Italy



   Abstract—Climate and biodiversity systems are closely interlaced across a wide range of
scales. To better understand the mutual interaction between climate change and biodiversity
there is a strong need for multidisciplinary skills, tools and a large variety of heterogeneous,
distributed data sources. In this regard, the EUBrazilCloudConnect project provides a
user-centric research environment built on top of a federated cloud infrastructure across
Europe and Brazil to serve scientific needs. One of the test cases implemented in this project
focuses on climate change and biodiversity research. BioClimate is the Science Gateway of this
use case. It aims at providing end-users with a highly integrated environment, addressing mainly
data analytics requirements. This paper presents a complete overview of BioClimate and the
scientific environment delivered to the user community at the end of the project.

   Keywords—Science Gateways, Scientific Data Management and Analytics, Environmental Sciences.

                      I. INTRODUCTION

   Climate and biodiversity systems are closely interlaced across a wide range of scales. In
order to predict the effects of climate change on the biodiversity system, which is essential
towards sustainable landscape and eco-services management, there is a need to further
investigate the interaction between the climate system and biodiversity.

   Direct measurements of climate and biodiversity are often difficult and time-consuming to
obtain; instead, it is common practice to use climate and biodiversity indicators. These
interactions can be studied at various scales, ranging from microscopic scales to the (genomic,
taxonomic, ecosystem) scales of individual plant and animal species. A multi-scale and
integrated approach is required to investigate the climate-biodiversity system as a whole.
Presently, in this scenario, researchers and professionals are burdened by scattered data
sources, a wealth of analysis tools to master and implement, and computational limitations to
upscale their analysis.

   EUBrazilCloudConnect [1] is a project from the third coordinated EU-Brazil call. It is a
preliminary step towards providing a user-centric environment for the scientific research
communities to test the execution of challenging applications exploiting a federated cloud
infrastructure. The project addresses the scientific challenges of three multidisciplinary and
highly complementary scenarios, among which the one on biodiversity, natural resources and
climate change represents the most challenging one from the scientific data management
standpoint. The proposed scientific scenarios require access to the project e-infrastructure to
run complex workflow pipelines as well as access to heterogeneous and large datasets for data
analysis and visualisation.

   The Biodiversity and Climate Change use case (BioClimate) involves multiple heterogeneous
data sources (e.g. SEBAL, LiDAR, CRU, CMIP5, speciesLink, GBIF, etc.) and several processing
pipelines, integrated through the BioClimate Scientific Gateway. The gateway sits on top of the
databases and enables near-real-time analysis of large-volume datasets (from multi-GBs to
multi-TBs scale depending on the specific data source) through the Parallel Data Analysis
Service (PDAS). PDAS clusters are deployed on the site where the databases are stored, providing
the end-user with a high-level, parallel, and server-side interface for scientific data analysis.

   The design of the software infrastructure and the BioClimate Scientific Gateway for end-users
facilitates joint research using data that is otherwise difficult to access or for which
availability is fragmented and/or too large to process using traditional computational means.
With regard to existing approaches and tools that are mainly client-side/desktop based, the use
case delivers a well-integrated environment for climate change and biodiversity research with
cloud-based infrastructure and server-side capabilities.

   This work presents the BioClimate Scientific Gateway, the scientific challenges addressed and
the implementation details. The remainder of this work is organised as follows. Section II
provides an overview of the BioClimate use case and its main goals. Section III provides a
general description of the BioClimate Scientific Gateway architecture, whereas Section IV and
Section V give, respectively, a detailed description of the graphic interface and the back-end.
Finally, Section VI draws the main conclusions and describes the future activities.
   II. A BIODIVERSITY & CLIMATE CHANGE USE CASE
   The EUBrazilCloudConnect (EUBrazilCC) use case on
climate change and biodiversity is a data-driven use case,
aiming at better understanding the interactions between the
biodiversity system and the climate system. This use case
focuses on bringing together a wide variety of climate and
biodiversity data and analysis tools into a user-friendly and
web-based Science Gateway to provide an integrated approach
to investigating climate and biodiversity across different
temporal and spatial scales.
   To address all these scientific challenges, the use case
joins together heterogeneous data sources, on-premises cloud
infrastructures, multiple data services, and a Science Gateway
into a single, federated trans-Atlantic environment.
   The Science Gateway provides access to historical temperature and precipitation records,
different climate model scenarios with predictions of future temperature and precipitation,
Landsat [2] satellite imagery for climate and biodiversity indicators, LiDAR 3D forest metrics
and biodiversity indicators at a very high resolution, and plant occurrences data for ecological
niche models for the prediction of future plant distribution based on different climate
scenarios. The proposed pipelines/workflows combine the analysis of data acquired from these
different technologies to study the impact of climate change in regions with high interest for
biodiversity conservation, such as the Brazilian Amazon and the semi-arid Caatinga regions in
Brazil. The analysis of remote sensing images provides 3D information concerning the structure
of the vegetation, which improves biodiversity indicators such as the energy balance and
evapotranspiration.

   The EUBrazilCC infrastructure provides the computing power needed to support data processing
and analysis, the management of metadata to enable search and discovery as well as provenance
management to address re-usability and reproducibility, both strongly relevant for scientific
data environments. The BioClimate Scientific Gateway integrates in a web-based environment the
data sources and the processing and analysis capabilities exploiting the project infrastructure.
More specifically, the gateway has been designed to fulfil some key requirements:

  • Integration of heterogeneous data sources. The gateway provides a unified interface to
    access and process satellite images (from Landsat), environmental data, future climate
    scenarios, biodiversity data like species distributions and LiDAR datasets related to some
    target areas. Furthermore, the gateway also provides metadata information describing these
    data sources.
  • Implementation of processing tools. To support data analysis, several tools are integrated
    in the gateway to allow: computation of 3D vegetation products based on LiDAR data [3]
    (e.g. Digital Surface Model (DSM), Digital Terrain Model (DTM), Canopy Height Model (CHM),
    Relative Height at 50% (RH50)), execution of Ecological Niche Modeling over species data and
    processing of datasets from climate models and the SEBAL algorithm.
  • Usability. The interface is designed to: (i) let the end-user select the target data
    source, an area of interest and the temporal scale; (ii) submit an experiment computation;
    (iii) visualise the processed results in terms of maps, graphs, tables and comparative
    charts; and (iv) download the aggregated results and products regarding satellite images and
    3D vegetation products (CSV, Raster, GeoTIFF and PNG formats).

                      III. GATEWAY ARCHITECTURE

   The software architecture of the use case is shown in Figure 1. The BioClimate Scientific
Gateway represents the high-level user interface provided by the use case. It allows data
access, analysis and visualisation over multiple, heterogeneous data sources, by exposing an
integrated view of the data level. It supports several features, such as time-series and
statistical analysis, data inspection, intercomparison and subsetting.

                 Fig. 1. BioClimate high-level use case architecture

   The elastic-job engine takes care of the execution of the requests submitted through the
gateway interface by translating the requests into PDAS tasks and then properly scheduling the
jobs on the available resources. To guarantee scalability, it elastically adapts to the
analytics workload exploiting the underlying cloud resources. The engine interacts with the
Infrastructure Manager (IM) [4] to deploy and un-deploy PDAS cluster instances on-demand. A
detailed description of the implementation and the main features of both the Science Gateway
interface and the engine is provided in the next sections.

   A system catalog is used by both the front-end and the back-end to store useful information
regarding user management, experiment execution requests and results, and PDAS cluster usage
history; it also serves as a centralised data repository.

   The PDAS, a core component of the Ophidia project [5], [6], provides support in terms of data
analytics applied to large scientific datasets. It includes functionalities to deal with different
scientific data formats, such as NetCDF (Network Common Data Form) [7] and satellite data, and
allow mathematical and statistical operations on this data. Python scripts, integrated in the
PDAS, provide additional functionalities to process LiDAR products and interact with external
tools (e.g. GDAL [8]) and services (e.g. OpenModeller [9]).

   The gateway also provides access to the BioClimate Clearing House, a database where the user
can persistently store the results of the experiments run during a session and retrieve them
through the search functionalities.

   The lowest layer of the diagram comprises several private clouds, running OpenNebula or
OpenStack at the Infrastructure as a Service (IaaS) level, and the data sources, made available
by the project partners or already available from national and international agencies, which
are part of the infrastructure with a more static setup.

   The data sources integrated through the gateway are the following:

  • SEBAL datasets. These are an output of satellite images series (Landsat) processed by the
    SEBAL [10], [11] algorithm to produce estimates of energy balance and evapotranspiration of
    water to the atmosphere. Remote sensing data are provided by the United States Geological
    Survey (USGS) and the National Aeronautics and Space Administration (NASA). In particular,
    the infrastructure allows processing of Landsat data coming from the Brazilian Semiarid
    region.
  • LiDAR data. For the areas near Manaus in Brazil, where hyper-spectral imagery is apparently
    absent, EUBrazilCloudConnect leverages the available LiDAR data provided by EMBRAPA [12]
    (Brazilian Agricultural and Livestock Research Corporation). Vegetation and terrain metrics
    represent the key indicators that can be inferred from these datasets.
  • Biodiversity data sources. The speciesLink datasets [13], provided by CRIA, the Reference
    Center on Environmental Information, are an output of networking activities to provide free
    and open access to 7.3 million primary research-grade data records, derived from the
    federation of 350 Brazilian biodiversity datasets, gathered from 150 institutions in Brazil
    and abroad. They represent valuable biodiversity data sources.
  • Climate data from the CMIP5 Federated Data Archive (ESGF) [14]. The Coupled Model
    Intercomparison Project (CMIP) provides a community-based infrastructure in support of
    climate model diagnosis, validation, intercomparison, documentation and data access. CMCC
    provides about 100 TB of data related to three different models, in NetCDF format following
    the CF conventions. Starting from these datasets, multiple climate indicators can be
    computed.
  • Climate data from observed data. These high-resolution gridded datasets (CRU TS v.3.23 [15])
    provide monthly values for several variables, such as temperature and precipitation, for a
    historical time period and are made available under the Open Database License by the
    Climatic Research Unit, University of East Anglia.

   Finally, security cuts across the whole architecture and is taken into account at several
levels. With regard to the front-end, security is implemented in terms of user authentication.
In order to avoid potential attacks that aim at stealing passwords, the system employs salted
password hashing based on the Password-Based Key Derivation Function 2 (PBKDF2) [16], with salts
drawn from a Java Cryptographically Secure Pseudo-Random Number Generator. Additionally, HTTPS
is used to provide encryption for the communications between client and server.

   At the elastic-job engine level, the PDAS terminal is used to send requests to a PDAS server
interface. It can exploit X.509v3 certificate-based authentication and VOMS-based
authorisation. Different levels of privileges are defined to distinguish user roles locally at
each PDAS server or globally at the VOMS server. For this purpose, a GSI/VOMS enabled interface,
supporting both X.509 certificates and VOMS-based authorisation and addressing the
interoperability with the EGI Fed Cloud environment [17], has been defined.

                      IV. USER INTERFACE INSIGHTS

   In order to address portability of the system and the separation of concerns between the
presentation layer and the business logic, the gateway has been implemented according to the
Model-View-Controller pattern.

   The presentation layer, running on the client side (i.e. a browser), provides a rich user
interface to submit the data analysis tasks and visualise their results. It is implemented as a
JavaScript web application based on the ExtJS library [18], which offers a number of gadgets
such as panels, charts and grids, and the Google Maps API [19] for the visualisation of
geo-referenced data.

   The server side of the Science Gateway implements the business logic to manage users, handle
the requests and post-process the results, and is based on Java and the Apache Struts2
framework [20].

   To increase the performance and make the output visualisation faster, the heavier tasks,
related to the post-processing of the outputs, are performed on the server side and the
ready-to-use result is presented to the JavaScript library on the presentation layer.

   Usability has been addressed by defining and implementing a set of pre-defined experiments
regarding the different data sources and types of analysis. Each experiment defines a
customisable template to perform data analytics tasks on climate and biodiversity data and
requires a specific pipeline of operations, including subsetting, data reduction and
mathematical/statistical functions.

   The following subsections provide a description of the main views and interfaces made
available by the gateway.

A. Interactive analysis

   The “Interactive analysis” panel allows a real-time, exploratory analysis of time series from
the climate data available in the use case. In particular, it provides access to CRU historical
data (temperature and precipitation variables) and future simulated data from the CMIP5
experiment (maximum and minimum temperatures from different climate models and scenarios).

                 Fig. 2. Interactive analysis

                 Fig. 3. SEBAL Interannual analysis compute interface

   As shown in Figure 2, the interface allows the selection of a dataset and a variable from the
list of datasets/variables available and a point from the map. The bottom section of the Science
Gateway displays the result of the analysis in terms of: (i) a chart with the time series and
its trend line and (ii) a table with a comprehensive set of aggregated statistics.
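
   As an illustration of what such a result could contain, the following is a minimal Python
sketch that fits a least-squares trend line to a time series and computes a few aggregated
statistics. It is not the gateway's or the PDAS implementation: the function names, the chosen
statistics and the sample values are assumptions made only for this example.

```python
from statistics import mean, pstdev


def linear_trend(times, values):
    """Ordinary least-squares fit: values ~ slope * times + intercept."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need two or more paired samples")
    t_mean, v_mean = mean(times), mean(values)
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time coordinates must not all be equal")
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_mean - slope * t_mean


def summarise_series(times, values):
    """Return the trend line plus a small set of aggregated statistics."""
    slope, intercept = linear_trend(times, values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "std": pstdev(values),  # population standard deviation
        "trend_slope": slope,
        "trend_intercept": intercept,
    }


# Hypothetical monthly mean temperatures (degrees C) at one map point.
months = [0, 1, 2, 3, 4, 5]
temps = [24.1, 24.3, 24.2, 24.6, 24.8, 24.9]
summary = summarise_series(months, temps)
print(summary)
```

   The slope is expressed in value units per time step, so for monthly data it reads as the
change per month; the chart's trend line would be drawn from the slope and intercept.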

B. Batch analysis

   The “Compute” panel provides the features to define and submit complex experiments regarding
the available data sources. For each experiment, a map for spatial selection and a form to set
the input parameters are provided. The following experiments are defined:

   • Interannual analysis of SEBAL output (see Figure 3) provides information about interannual
     trends and statistical information of a specific SEBAL variable. The Science Gateway
     integrates data processed by the SEBAL algorithm and provides functionalities to analyse
     several variables produced by this algorithm (e.g. Enhanced Vegetation Index, Leaf Area
     Index, Normalized Difference Vegetation Index, etc.). The interface allows both spatial and
     temporal selection.
   • Climate and SEBAL variables intercomparison allows the comparison of the behaviour of
     climate and SEBAL variables. In particular, it supports analysis over the variables
     produced by the SEBAL algorithm and variables (precipitation and temperature) from
     historical climate data. From a scientific point of view, this experiment provides useful
     information about the relationship between climate and vegetation indices.
   • Climate indices intercomparison allows comparison of indicators computed on CMIP5 datasets
     belonging to different climate models and future emission scenarios (RCP4.5 and RCP8.5
     [21]). Four well-known indicators based on maximum and minimum temperature are available
     for comparison (i.e. TXx, TNx, TXn, TNn [22]).
   • Ecological Niche Modelling (ENM) experiment integrates the functionalities available
     through the OpenModeller Web Service API to create and project models defined over
     occurrences of biodiversity data. This experiment allows the comparison of the projections
     of models into three different environmental scenarios (present, future optimistic and
     future pessimistic). The models are created with the maximum entropy algorithm [23] and are
     based on the species occurrences selected by the user.
   • LiDAR products intercomparison allows comparison and evaluation of the statistical
     relationship between LiDAR products available through the gateway (e.g. DSM, DTM, CHM). In
     this case, a LiDAR tile can be selected from the map.
   • Relative Height analysis of LiDAR data provides information about relative height at
     different percentiles (25%, 50%, 66%, 75% and 90%) of the points in a LiDAR tile.

                 Fig. 4. SEBAL Interannual analysis details interface

C. Experiment visualisation & download

   Once the computation of the experiment is completed, details about the experiment are
available through the “Experiment Details” section. Figure 4 displays the output produced
by a SEBAL interannual experiment, whereas Figure 5 and Figure 6 display the output produced by
a LiDAR intercomparison experiment and a Climate-SEBAL intercomparison experiment respectively.

                 Fig. 5. LiDAR intercomparison details interface

                 Fig. 6. Climate-SEBAL intercomparison details interface

   In particular, to better suit the experiment peculiarities, a specific detail view is
provided for each experiment defined above. Hence, various gadgets organised in different
fashions are used to display the results, including: line charts to display statistical values
and trend lines; scatter plots to evaluate the correlation between variables and indicators;
tables to show the results and statistical values; maps with the environmental scenario; images
of the LiDAR products; and histograms of the point distribution.

   Most of the information provided through the gadgets is also available for download in CSV,
raster, GeoTIFF or PNG format, depending on the type of experiment run. Furthermore, metadata
regarding the experiment is available in the same view.

D. BioClimate Clearing House

   The BioClimate Clearing House system allows users to store a relevant experiment, run during
a session, for future analysis. A smart search feature is available to filter the experiments
saved into the Clearing House, based on: (i) spatial domain used for the experiment, (ii)
experiment type and (iii) submission date.

E. Infrastructure Monitoring

   The BioClimate Scientific Gateway includes two administrative interfaces that (i) allow
managing users and their privileges and (ii) provide some information about the resources
exploited dynamically by the gateway (i.e. PDAS cluster instances) as well as some statistics
regarding the number of experiments executed in terms of their type and status. Through this
dashboard (see Figure 7) it is possible to get some insights about the use of the system by the
end-users. The charts mainly provide real-time monitoring information regarding the number of
experiments running/pending and the status of the resources. In particular, a histogram shows
the set of experiments and their distribution across the active PDAS instances for the last
couple of minutes, whereas a pie chart shows the set of clusters currently running the
experiments.

                 Fig. 7. Monitoring Dashboard

                      V. ELASTIC-JOB ENGINE

   The elastic-job engine is designed to guarantee fast processing of the user requests by
exploiting dynamically and elastically the federated cloud infrastructure. To meet scalability
and performance requirements, the engine is implemented as a multi-threaded daemon, based on GNU
C libraries, that exploits the PDAS capabilities to perform pipelines of analytics tasks.

   Data-driven processing pipelines, based on PDAS operators, have been defined integrating
different tools, services and data formats.

   Management of the workload is performed exploiting a smart scheduling algorithm, which
provides dynamic job scheduling over a set of queues. A job queue is associated to each PDAS
cluster running on the infrastructure. To horizontally scale on the workload, a new PDAS
instance is deployed automatically on the private cloud resources when the number of pending
jobs on all the queues exceeds a configurable threshold. A more detailed description of the
automated cloud deployment (through the elastic-job engine) of the PDAS, as well as of the queue
policy adopted and its rationale, is out of the scope of this paper and can be found in [24].
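
   The threshold rule described above can be pictured with a short Python sketch. It is only an
illustration and not the engine's code, which is a C daemon whose queue policy is documented in
[24]. Several points are assumptions made for the example: the threshold is compared against the
total number of pending jobs summed over all queues, new jobs go to the shortest queue, the
number of clusters is capped, and `fake_deploy` stands in for the real deployment request to the
Infrastructure Manager. Un-deployment of idle clusters is left out.

```python
import itertools
from collections import deque


class ElasticScheduler:
    """Toy model: one job queue per cluster, scale out above a threshold."""

    def __init__(self, threshold, max_clusters, deploy_cluster):
        self.threshold = threshold            # pending jobs tolerated in total
        self.max_clusters = max_clusters      # assumed cap on deployed clusters
        self.deploy_cluster = deploy_cluster  # callable returning a cluster id
        self.queues = {}
        self._add_cluster()                   # start with a single cluster

    def _add_cluster(self):
        cluster_id = self.deploy_cluster()
        self.queues[cluster_id] = deque()
        return cluster_id

    def pending(self):
        """Total number of pending jobs over all queues."""
        return sum(len(queue) for queue in self.queues.values())

    def submit(self, job):
        """Queue the job on the least-loaded cluster, then scale if needed."""
        target = min(self.queues, key=lambda cid: len(self.queues[cid]))
        self.queues[target].append(job)
        if self.pending() > self.threshold and len(self.queues) < self.max_clusters:
            self._add_cluster()
        return target


_ids = itertools.count(1)


def fake_deploy():
    """Hypothetical stand-in for a real cluster deployment call."""
    return f"pdas-{next(_ids)}"


scheduler = ElasticScheduler(threshold=2, max_clusters=3, deploy_cluster=fake_deploy)
targets = [scheduler.submit(f"job-{n}") for n in range(5)]
print(targets, len(scheduler.queues))
```

   In this run the third job pushes the pending count above the threshold, so a second cluster
is added and later jobs are routed to the emptier queues until the cap of three is reached.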



A. PDAS

   As mentioned before, the PDAS provides the capabilities to perform data analytics on large
scientific datasets and includes a set of libraries able to deal with different data formats.
In the EUBrazilCC project, the PDAS addresses scientific challenges related to the BioClimate
use case and is used for both batch and interactive data analysis on NetCDF, LiDAR and remote
sensing data.

   All the outputs of the PDAS are stored in JSON format. This eases the integration of the
results into web contexts like the BioClimate Scientific Gateway and the parsing of the outputs
from JavaScript and Python-based applications.

   To address the data analytics requirements and support the processing pipelines of the use
case, several new features and mathematical functionalities have been developed during the
project lifetime. In particular, regarding the interactive analysis, an operator that allows
data inspection and on-the-fly exploration of time series has been implemented, whereas to run
the batch experiments, processing pipelines made up of several new operators and functions have
been defined. To integrate external tools, an operator to run scripts has also been developed.
Besides the previous extensions, the import process has also been optimised to reduce the time
required to import large-scale datasets such as SEBAL output data.

                      ACKNOWLEDGMENT

   This work was supported by the EU FP7 EUBrazilCC Project (Grant Agreement 614048), and
CNPq/Brazil (Grant Agreement 490115/2013-6).

                      REFERENCES

 [1] EUBrazilCC. [Online]. Available: http://eubrazilcloudconnect.eu
 [2] The Landsat program. [Online]. Available: http://landsat.gsfc.nasa.gov/
 [3] M. A. Lefsky, W. B. Cohen, G. G. Parker, and D. J. Harding, “Lidar remote sensing for
     ecosystem studies,” BioScience, vol. 52, no. 1, pp. 19–30, 2002. [Online]. Available:
     http://bioscience.oxfordjournals.org/content/52/1/19.short
 [4] M. Caballer, I. Blanquer, G. Moltó, and C. Alfonso, “Dynamic management of virtual
     infrastructures,” Journal of Grid Computing, vol. 13, no. 1, pp. 53–70, 2014.
 [5] S. Fiore, A. D’Anca, C. Palazzo, I. T. Foster, D. N. Williams, and G. Aloisio, “Ophidia:
     Toward big data analytics for eScience,” in Proceedings of the International Conference on
     Computational Science, ICCS 2013, Barcelona, Spain, 5-7 June, 2013, 2013, pp. 2376–2385.
 [6] S. Fiore, C. Palazzo, A. D’Anca, I. Foster, D. N. Williams, and G. Aloisio, “A big data
     analytics framework for scientific data management,” in Big Data, 2013 IEEE International
     Conference on, Oct 2013, pp. 1–8.
 [7] R. K. Rew and G. P. Davis, “The Unidata netCDF: Software for scientific data access,” in
     Sixth International Conference on Interactive Information and Processing Systems for
     Meteorology, Oceanography, and Hydrology, 1990, pp. 33–40.
 [8] GDAL library. [Online]. Available: http://www.gdal.org/
 [9] M. E. Souza Muñoz, R. Giovanni, M. F. Siqueira, T. Sutton, P. Brewer, R. S. Pereira,
     D. A. L. Canhos, and V. P. Canhos, “openModeller: a generic approach to species’ potential
     distribution modelling,” GeoInformatica, vol. 15, no. 1, pp. 111–135, 2009.
Finally, to automate the deployment of PDAS instances in                  [10] W. Bastiaanssen, M. Menenti, R. Feddes, and A. Holtslag, “A remote
the EUBrazilCC federated infrastructure, some cloud-based                      sensing surface energy balance algorithm for land (sebal). 1. formula-
scenarios, based on RADL files [4], have been implemented                      tion,” Journal of hydrology, vol. 212, pp. 198–212, 1998.
                                                                          [11] W. Bastiaanssen, H. Pelgrum, J. Wang, Y. Ma, J. Moreno, G. Roerink,
as reported in detail in [24].                                                 and T. Van der Wal, “A remote sensing surface energy balance algorithm
                                                                               for land (sebal).: Part 2: Validation,” Journal of hydrology, vol. 212, pp.
                                                                               213–229, 1998.
                                                                          [12] Embrapa. [Online]. Available: https://www.embrapa.br/
                       VI. C ONCLUSION                                    [13] C. Centro de Referencia em Informacao Ambiental. Specieslink service.
                                                                               [Online]. Available: http://splink.cria.org.br/
                                                                          [14] K. E. Taylor, R. J. Stouffer, and G. A. Meehl, “An overview of cmip5
   During the final validation phase of the EUBrazilCC project,                and the experiment design,” Bulletin of the American Meteorological
the BioClimate use case was highly appreciated by the end-                     Society, vol. 93, no. 4, pp. 485–498, 2012.
users, due to its ability to provide and deliver in the same              [15] I. Harris, P. Jones, T. Osborn, and D. Lister, “Updated high-resolution
                                                                               grids of monthly climatic observations - the cru ts3.10 dataset,” Inter-
environment tools, pipelines, analysis/visualisation features,                 national Journal of Climatology, vol. 34, no. 3, pp. 623–642, 2014.
and several data sources in an integrated manner.                         [16] B. Kaliski, “Pkcs #5: Password-based cryptography specification
                                                                               version 2.0,” RFC 2898, Sep. 2000. [Online]. Available: http:
   User experience was good and the change of paradigm                         //tools.ietf.org/html/rfc2898
(process the data on the server-side) was evaluated as the                [17] Egi fedcloud. [Online]. Available: http://www.egi.eu/infrastructure/
key added value. Despite it requires a learning process, the                   cloud/
                                                                          [18] Extjs library. [Online]. Available: http://docs.sencha.com/extjs/
BioClimate Scientific Gateway provides multiple views and                 [19] Google maps api. [Online]. Available: https://developers.google.com/
analyses of the retrospective data gathered. High level user                   maps/
experience and usability have been two key requirements                   [20] Apache struts2 framework. [Online]. Available: https://struts.apache.org/
                                                                          [21] Rcp emission scenarios. [Online]. Available: http://www.wmo.int/pages/
considered in the implementation phase.                                        themes/climate/emission scenarios.php
   A lot of interest was also raised by governmental & envi-              [22] Climate change indices. definitions of the 27 core indices. [Online].
ronmental agencies (both research & education) especially in                   Available: http://etccdi.pacificclimate.org/list 27 indices.shtml
                                                                          [23] Maximum entropy algorithm. [Online]. Available: http://openmodeller.
Brazil. A set of follow-up actions will be put in place from the               sourceforge.net/algorithms/maxent.html
different partners even beyond the project lifetime (that was             [24] S. Fiore et al., “Big data analytics for climate change and biodiversity in
part of the project sustainability plan).                                      the eubrazilcc federated cloud infrastructure,” in Proceedings of the 12th
                                                                               ACM International Conference on Computing Frontiers, CF’15, Ischia,
   Finally, the impact on the user community was very high.                    Italy, May 18-21, 2015, 2015, pp. 52:1–52:8.
The gateway was evaluated as seamlessly, flexibly and ef-
ficiently able to integrate a comprehensive and useful set
of scientific data management tools to increase the mutual
understanding between climate change and biodiversity.


