<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>GSKY: A scalable, distributed geospatial data-server</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pablo R. Larraondo</string-name>
          <email>pablo.larraondo@anu.edu.au</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sean Pringle</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joseph Antony</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ben Evans</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Computational Infrastructure, Australian National University</institution>
          <addr-line>ACT 2601</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Proc. of the 4th Annual Conference of</institution>
        </aff>
      </contrib-group>
      <fpage>7</fpage>
      <lpage>12</lpage>
      <abstract>
        <p>The rapid growth of earth systems, environmental and geophysical datasets poses a challenge to both end-users and infrastructure providers. GSKY is a scalable, distributed server which presents a new approach for geospatial data discovery and delivery using OGC standards. In this paper we discuss the architecture and motivating use-cases that drove GSKY's design, development and production deployment. We show that our approach offers the community valuable exploratory analysis capabilities for dealing with petabyte-scale geospatial data collections.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Copyright © by the paper's authors. Copying permitted only for private and academic purposes.</p>
      <p>any underlying file format or data organisation. This is achieved by decoupling the data ingestion and indexing
process into an independent service. An ingestion service crawls collections either locally or remotely, extracting,
storing and indexing all spatio-temporal metadata associated with each individual record or data-collection.</p>
      <p>GSKY has functionality for specifying how ingested data should be aggregated, transformed and presented. It
presents an OGC standards-compliant interface, making data readily accessible to users via Web Map Services
(WMS) and Web Processing Services (WPS), or the underlying source data via Web Coverage Services (WCS). We
also present use-cases in which these new capabilities provide a significant improvement over
previous approaches.</p>
      <p>This paper is structured as follows: Section 2 provides background and reviews previous work. Section 3
describes the motivating use-cases that drove the development of GSKY. Section 4 gives a high-level architectural
view of GSKY. Section 5 concludes and proposes future developments.</p>
    </sec>
    <sec id="sec-2">
      <title>Background and Previous Work</title>
      <p>
        Amongst all available geospatial data server implementations, THREDDS
        <xref ref-type="bibr" rid="ref4">(Caron and Davis, 2006)</xref>
        and
GeoServer
        <xref ref-type="bibr" rid="ref6">(Deoliveira, 2008)</xref>
        have been popular choices for exposing and interacting with data in the climate
and earth observation communities.
      </p>
      <p>
        THREDDS provides remote access to geospatial data stored in scientific multidimensional file formats such
as NetCDF, GRIB or HDF. It implements a binary serialization protocol, OPeNDAP
        <xref ref-type="bibr" rid="ref5">(Cornillon et al., 2003)</xref>
        ,
which allows users to efficiently request the contents of files or subsets of them. It also provides capabilities to
aggregate several files along the spatial and temporal dimensions into a single entity using a markup language
called NcML. However, these capabilities are limited and currently do not scale well when the number of files
grows or when dealing with sparse data.
      </p>
      <p>GeoServer has been, for many years, the tool of choice to serve geospatial data to the earth observation and,
more generally, GIS communities. It uses OGC standards to present data to the user as rendered images
(WMS), gridded numerical values (WCS) or geometries (WFS), and also offers the possibility to perform analysis
operations (WPS). GeoServer has the ability to abstract the notion of map projections, allowing users to consume
the same data in different projections where the requisite transformations occur in real time, server-side.</p>
      <p>Currently GeoServer is unable to provide services directly from source files that form a data-collection. It is
dependent on pre-generation of an internal representation of the data at different resolutions, known
as pyramids<sup>1</sup>. These pyramids accelerate access to the underlying data-collection, at the expense of having to
compute and store the pyramids beforehand. When the size of the collection being served grows, the effort to
generate and store these internal products also increases, often leading to significant management overheads.</p>
      <p>We note that the data-management models implemented by THREDDS and GeoServer are currently limited
by a) the lack of features for creating aggregations over collections of files easily and performantly
(as in the case of THREDDS), or b) the inability to generate views over data in real time from the original files
without requiring any intermediate product generation (as in the case of GeoServer).</p>
      <p>Both solutions have been designed to operate as a single server. Hence, the usual method of scaling up their
respective operational response capacity is to improve the hardware capabilities of the underlying hosting
machine. This increases costs, eventually limits the data that can be served and degrades the end-user
experience.</p>
      <p>As the size of geospatial data collections grows, there is a corresponding need for solutions which allow for
the serving of information directly from underlying 'raw' files. Hand-in-hand with this, one needs to scale out
services by using clusters of machines to perform the required aggregations and transformations both on-demand
and in a robust, fault-tolerant manner. Recent projects have appeared to address this problem and we review
some of these key efforts.</p>
      <p>
        The EarthServer project<sup>2</sup> is an international project funded through the European Union (EU) encompassing
multiple organisations from different countries. The Array Database System, RasDaMan
        <xref ref-type="bibr" rid="ref2">(Baumann et al., 1998)</xref>
        ,
a raster database management system, is the underlying technology used by the project. RasDaMan can serve
large collections of satellite and climate data using OGC standards. It offers comprehensive ingestion mechanisms
allowing users to serve different kinds of data, as well as describing and performing complex server-side analytics
using the WCPS<sup>3</sup> standard. However, the open source version of RasDaMan cannot use external 'raw' data files
as its source data; it requires data to be fully ingested into its backing database. This limitation is solved in the
commercial version; however, little published information is available on the formats and conventions
that the files require or on how it implements its distributed computing methods.
      </p>
      <p><sup>1</sup> http://docs.geoserver.org/stable/en/user/tutorials/imagepyramid/imagepyramid.html</p>
      <p><sup>2</sup> http://www.earthserver.eu/</p>
      <p><sup>3</sup> http://www.opengeospatial.org/standards/wcps</p>
      <p>
        The Australian Geoscience Data Cube (AGDC) is another project working on the goal of achieving a general
geospatial data platform
        <xref ref-type="bibr" rid="ref11">(Lewis et al., 2016)</xref>
        . AGDC is presented as an open-source platform enabling public
access to data from several earth observation satellites, and offers analytics capabilities, for example, to study
temporal trends over user-defined geographic areas.
      </p>
      <p>
        GeoTrellis
        <xref ref-type="bibr" rid="ref10">(Kini and Emanuele, 2014)</xref>
        is a relatively new open-source project for enabling geospatial processing
based on the MapReduce architecture. This project builds on top of the available algorithms in Apache Spark,
allowing for batch and interactive processing on large geospatial datasets.
      </p>
      <p>
        Google Earth Engine (GEE)
        <xref ref-type="bibr" rid="ref8">(Gorelick, 2013)</xref>
        can be considered the model of a production-ready, large-scale,
distributed geospatial data server. Scientists and users from the general public rely on it for research
outputs, as well as to rapidly visualise or analyse temporal trends. GEE offers a complete and well-defined
graphical user interface and programmatic access via its Python and JavaScript APIs. Users can define their
analysis workflow using these interfaces and submit it to the backend infrastructure. The required computations
are distributed among Google's cloud infrastructure and results are gathered and returned to the user within a
time determined by the complexity of the operation and the billing policy of the user. Apart from the public
APIs, there is little information available about the back-end infrastructure and distribution model used by
GEE.
      </p>
    </sec>
    <sec id="sec-3">
      <title>GSKY's Motivating Use-Cases</title>
      <p>The development of GSKY has been driven by the need to expose curated satellite imagery
collections and climate simulation output to NCI's user community. The NCI has experienced significant demand
from researchers wanting to combine data from different sources, as well as rapidly test new computational
models/algorithms. Often, data fusion processes require inputs to be normalised to allow pixel-to-pixel comparisons
between different datasets. This process poses significant challenges when the different
conventions, storage/file formats, map projections and data-types used in the curated data collections hosted
at the NCI are not known a priori.</p>
      <p>
        Another major driver for a unifying methodology to work with geospatial data collections comes from the machine
learning community. Recent advances in image recognition based on deep learning techniques have opened an
exciting path to explore new ways of using geospatial data
        <xref ref-type="bibr" rid="ref12">(Marmanis et al., 2016)</xref>
        . Specific hardware and
software to run these techniques is becoming available, but the underlying machine learning algorithms require
access to large collections of ready-to-consume data, which at present is a limiting factor to uptake. GSKY aims
to reduce this gap by offering geospatial data-as-a-service which can be consumed by these emerging applications.
      </p>
      <p>
        A key project which helped mature GSKY's production deployment is the GEOGLAM Rangeland and Pasture
Productivity (RAPP) project
        <xref ref-type="bibr" rid="ref9">(Guerschman et al., 2015)</xref>
          <sup>4</sup>. The project is a joint initiative between CSIRO and
the Group on Earth Observations (GEO), aimed at monitoring the condition of the world's rangelands and
pasture lands in real-time. This project requires synthesis of several satellite products based on MODIS as well
as modeled climatic data for visualization and interactive analysis. The project uses the National Map front-end
infrastructure
        <xref ref-type="bibr" rid="ref3">(Belgun et al., 2015)</xref>
        to expose the data and connect to the services that GSKY provides at the
back-end. Figure 1 is a screen-shot of the production GEOGLAM RAPP portal containing a representation of
the different kinds of vegetation covering the planet. Using this application, end-users can interactively visualise
and perform regional time series analysis for different satellite products. Apart from MODIS satellite data,
GSKY has been validated with the following collections: ERA-Interim, CHIRPS-2.0, Himawari 8 and global
Landsat 8.
      </p>
    </sec>
    <sec id="sec-4">
      <title>GSKY's Architecture</title>
      <p>GSKY posits a model of generating meaningful end-user products from the underlying original source data,
without the need to store intermediate products or pre-compute results. Geospatial analysis often involves a
series of data extraction and image transform operations, which can be modeled as a transformation
of the original data into a specific product. These operations can be abstractly represented as a sequence of
independent processes in which data flows along a pipeline structure, producing the final result at the end, i.e. a
Directed-Acyclic-Graph (DAG). DAGs capture workflows, which are themselves defined as a composition of
individual processes that form the nodes of the graph, are connected together, and ultimately result in an
output product.</p>
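      <p>As an illustration only, the following Python sketch shows one way a workflow could be expressed as a DAG and executed in dependency order. It is not GSKY's implementation; the stage names and the toy stage functions are hypothetical, and the standard-library graphlib module stands in for a workflow engine.</p>
      <preformat><![CDATA[
```python
from graphlib import TopologicalSorter

# Hypothetical stage names; each stage lists the stages whose output it consumes.
DEPENDENCIES = {
    "find_files": [],
    "extract": ["find_files"],
    "merge": ["extract"],
    "encode": ["merge"],
}

# Toy stand-ins for the real processes (assumptions, not GSKY code).
STAGES = {
    "find_files": lambda: ["/col1/a", "/col1/b", "/col1/c"],
    "extract": lambda files: [len(path) for path in files],
    "merge": lambda values: sum(values),
    "encode": lambda total: f"result={total}",
}


def run_workflow(stages, dependencies):
    """Run every stage after its dependencies and return all outputs by name."""
    outputs = {}
    # static_order() yields each node only after all of its predecessors,
    # and raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(dependencies).static_order():
        inputs = [outputs[dep] for dep in dependencies[name]]
        outputs[name] = stages[name](*inputs)
    return outputs


outputs = run_workflow(STAGES, DEPENDENCIES)
```
]]></preformat>
      <p>Because each stage only names its inputs, stages can be recombined into other workflows without changing their code.</p>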
      <p>
        GSKY implements the following OGC standards: WMS, WCS and WPS. These protocols provide a well-defined
interface for the input requests as well as the output responses. Parameters specified in these protocols
are used as inputs to feed the previously described workflows, whose outputs are then returned to the user using
standards-prescribed output formats. The implementation of these workflows is generic, hence allowing GSKY
to be easily extended to support OpenStreetMap or OPeNDAP
        <xref ref-type="bibr" rid="ref5">(Cornillon et al., 2003)</xref>
        in the future.
      </p>
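      <p>To make the mapping from protocol parameters to workflow inputs concrete, the sketch below parses a WMS 1.3.0 GetMap request into the values a tile workflow would need. The endpoint, layer name and the shape of the returned dictionary are assumptions made for illustration; only the query parameter names come from the WMS standard.</p>
      <preformat><![CDATA[
```python
from urllib.parse import urlparse, parse_qs


def parse_getmap(url):
    """Extract the GetMap parameters a tile workflow needs from a WMS URL."""
    raw = parse_qs(urlparse(url).query)
    # WMS parameter names are case-insensitive, so normalise them to upper case.
    query = {key.upper(): values[0] for key, values in raw.items()}
    if query.get("REQUEST", "").lower() != "getmap":
        raise ValueError("not a GetMap request")
    bbox = tuple(float(value) for value in query["BBOX"].split(","))
    return {
        "layer": query["LAYERS"],
        "crs": query["CRS"],  # WMS 1.1.1 uses SRS instead of CRS
        "bbox": bbox,
        "size": (int(query["WIDTH"]), int(query["HEIGHT"])),
        "time": query.get("TIME"),  # optional dimension
    }


# Hypothetical endpoint and layer name.
url = (
    "http://example.org/ows?SERVICE=WMS&VERSION=1.3.0&REQUEST=GetMap"
    "&LAYERS=frac_cover&CRS=EPSG:3857&BBOX=0,0,10,10&WIDTH=256&HEIGHT=256"
    "&TIME=2017-01-01T00:00:00Z&FORMAT=image/png"
)
params = parse_getmap(url)
```
]]></preformat>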
      <p>The following sections focus on two fundamental components of GSKY. The first section, the indexing system,
provides details on how metadata is ingested and exposed by the system. The next section provides details on
how workflows are implemented and also offers examples to help illustrate some of the architectural concepts.</p>
      <sec id="sec-4-1">
        <title>Indexing System</title>
        <p>Geospatial data collections such as satellite collections or numerical climate simulations are usually stored as
a collection of files. Each of these files contains a small subset of the data, constrained by a) certain spatial or
temporal ranges; b) a set of variables or parameters. Users accessing these collections often require a priori
knowledge of how the data is structured in order to locate files of interest.</p>
        <p>GSKY presents users with the abstraction of a geospatial data collection, in which individual files are
hidden from end-users. Users define parameters such as the spatial and temporal ranges for their request(s) as
well as the names of any requisite variables. This high-level query is transformed by the indexing system into
a list of files that contain parts of the data. The result can then be extracted, aggregated and presented to the
user by GSKY.</p>
        <p>To achieve this level of abstraction, GSKY relies on a heavily optimized PostGIS database, which can be
queried by collection, variable names and by user-defined spatial and temporal ranges. Metadata is gathered
using crawlers which periodically scan collections on local or remote file systems (or object stores such as Amazon
Web Services S3), automatically extracting all the relevant information for indexing by the database. These
crawlers run independently of other system components, locating new or modified files/objects to keep the
database updated.</p>
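        <p>The kind of lookup described above can be sketched in a few lines of Python. This is a toy, in-memory stand-in and not the PostGIS schema used by GSKY: the record fields, the variable name and the linear scan are assumptions, whereas a real deployment would rely on database indexes rather than scanning every record.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileRecord:
    """Metadata a crawler might store for one file (hypothetical fields)."""
    path: str
    collection: str
    variable: str
    bbox: tuple  # (min_x, min_y, max_x, max_y)
    start: datetime
    end: datetime


def boxes_intersect(a, b):
    """True when two (min_x, min_y, max_x, max_y) boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def find_files(index, collection, variable, bbox, start, end):
    """Return paths of files holding data for the requested area and period."""
    return [
        rec.path
        for rec in index
        if rec.collection == collection
        and rec.variable == variable
        and boxes_intersect(rec.bbox, bbox)
        and rec.start <= end
        and start <= rec.end
    ]


index = [
    FileRecord("/col1/a", "col1", "ndvi", (0, 0, 10, 10),
               datetime(2017, 1, 1), datetime(2017, 1, 31)),
    FileRecord("/col1/b", "col1", "ndvi", (10, 0, 20, 10),
               datetime(2017, 1, 1), datetime(2017, 1, 31)),
    FileRecord("/col1/c", "col1", "ndvi", (0, 0, 10, 10),
               datetime(2017, 2, 1), datetime(2017, 2, 28)),
]
hits = find_files(index, "col1", "ndvi", (2, 2, 8, 8),
                  datetime(2017, 1, 10), datetime(2017, 1, 20))
```
]]></preformat>
        <p>In this example only /col1/a matches: /col1/b lies outside the requested area and /col1/c outside the requested period.</p>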
        <p>The indexing system has been designed to process and serve high volumes of metadata in near real-time. In
production, the database has to be able to process queries in milliseconds, even for queries comprising large
spatial areas or temporal ranges, which can identify on the order of tens of millions of
files. The database has been tuned to use indexes and materialised views to achieve this level of performance.
To handle any future metadata processing pressures, the database can be clustered.</p>
        <p><sup>4</sup> GEOGLAM RAPP: http://www.geo-rapp.org/, Production site: http://map.geo-rapp.org/</p>
      </sec>
      <sec id="sec-4-2">
        <title>Distributed Pipelines</title>
        <p>
          At the heart of GSKY's architecture is a distributed processing system conceived as a workflow engine written
using the Go programming language. This design choice allows complex processes to be decomposed into a
series of small generic modules which can work concurrently to produce a result (abstractly captured as a DAG).
Each of these modules has a well-defined input and output type and internally implements the functionality
required to perform a transformation. The modules are then connected to form a network structure (a DAG)
which solves a specific problem. Most of the ideas behind this architecture have been borrowed from the
Flow Based Programming (FBP)
          <xref ref-type="bibr" rid="ref13">(Morrison, 2010)</xref>
          specification. More recently, there have been efforts to
model workflows in a generic way for deployment in distributed systems, such as clusters or the Cloud
          <xref ref-type="bibr" rid="ref1">(Akidau et al., 2015)</xref>
          . This approach facilitates both composability, where simple modules interact forming complex
structures, and re-usability. Our proposed workflow implementation does not offer a generic execution framework
for computation but rather focuses on expressing commonly used geospatial operations/idioms efficiently.
        </p>
        <p>Figure 2 is an example of a workflow which generates a tile image corresponding to a WMS request. Each
process in the pipeline works concurrently and communication between processes is performed using bounded
queues. Each queue stores the outputs of one process, which are also the inputs of the next process.</p>
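        <p>The pattern of concurrent processes linked by bounded queues can be illustrated with the Python sketch below. GSKY itself is written in Go, so this is an analogy under stated assumptions rather than its code: threads stand in for the concurrent processes, queue.Queue(maxsize=...) for the bounded queues, and the two arithmetic stages are placeholders.</p>
        <preformat><![CDATA[
```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream


def stage(func, inbox, outbox):
    """Apply func to every item from inbox and forward results to outbox."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))  # blocks while the downstream queue is full


def run_pipeline(items, funcs, capacity=2):
    """Chain funcs into a pipeline and return the outputs in input order."""
    # One bounded queue between each pair of neighbouring stages.
    queues = [queue.Queue(maxsize=capacity) for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(funcs)
    ]
    for worker in workers:
        worker.start()

    def feed():
        for item in items:
            queues[0].put(item)  # blocks while the first queue is full
        queues[0].put(DONE)

    feeder = threading.Thread(target=feed)
    feeder.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is DONE:
            break
        results.append(out)

    feeder.join()
    for worker in workers:
        worker.join()
    return results


results = run_pipeline(range(5), [lambda x: x + 1, lambda x: x * 10])
```
]]></preformat>
        <p>Bounding the queues applies back-pressure: a fast producer is paused when a slower downstream stage has not yet consumed its earlier outputs, which caps memory use.</p>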
        <p>Specifically, Figure 2 describes the sequence of processes to generate a tile image. The pipeline is fed by a WMS
request which specifies the product, an area delimited by a bounding box and a time range. Parameters specified
in the request are used by the first process of the pipeline to identify the list of files containing relevant data,
using the previously described index. The second process in the pipeline is responsible for accessing each file and
extracting the requested data into the right resolution and map projection. This step is the most CPU-intensive
and IO-demanding of the whole pipeline and, in most cases, acts as the limiting factor for the pipeline. To
speed up this process, work can be dynamically distributed between different nodes which operate concurrently
on subsets of the files. GSKY uses a form of Remote Procedure Calls (RPC) as a way of distributing work to a
cluster of remote machines over the network and collecting back the results once they have been processed. The
remaining steps in the figure describe how the data extracted from the different files
(i.e. /col1/a, /col1/b and /col1/c) is merged. The final result is then scaled, corrected and encoded into a WMS-compatible
image format such as PNG or JPEG.</p>
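        <p>The extract-then-merge steps can be sketched as follows. This is a simplified illustration, not GSKY's implementation: a local thread pool stands in for the cluster of RPC workers, the 2x2 tiles are made-up data, and the merge rule (keep the first valid value per pixel) is an assumption, since the text does not specify how overlapping data is combined.</p>
        <preformat><![CDATA[
```python
from concurrent.futures import ThreadPoolExecutor

NODATA = None

# Toy "files": each holds a 2x2 tile with gaps marked as NODATA.
FILES = {
    "/col1/a": [[1, NODATA], [NODATA, NODATA]],
    "/col1/b": [[9, 2], [NODATA, NODATA]],
    "/col1/c": [[9, 9], [3, NODATA]],
}


def extract(path):
    """Stand-in for the remote call that reads and reprojects one file."""
    return FILES[path]


def merge(tiles):
    """Overlay tiles in order, keeping the first valid value for each pixel."""
    rows, cols = len(tiles[0]), len(tiles[0][0])
    canvas = [[NODATA] * cols for _ in range(rows)]
    for tile in tiles:
        for r in range(rows):
            for c in range(cols):
                if canvas[r][c] is NODATA and tile[r][c] is not NODATA:
                    canvas[r][c] = tile[r][c]
    return canvas


def render(paths, workers=3):
    """Extract all files concurrently, then merge them into one tile."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tiles = list(pool.map(extract, paths))  # map preserves input order
    return merge(tiles)


canvas = render(["/col1/a", "/col1/b", "/col1/c"])
```
]]></preformat>
        <p>Because extraction of each file is independent, the expensive step parallelises cleanly, while the merge runs once over the collected results.</p>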
        <p>Time series analysis is a common task in geospatial analysis, where the evolution of a parameter is studied
for a certain period of time. This kind of analysis can be achieved using a similar workflow to the one previously
described, but exposed as a WPS.</p>
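        <p>A time series workflow differs mainly in its final step: instead of encoding an image, each time step is reduced to a single value. The sketch below assumes the reduction is a mean over valid pixels, which is an illustrative choice rather than something stated in this paper; the input tiles are made-up data.</p>
        <preformat><![CDATA[
```python
from datetime import date
from statistics import mean


def time_series(tiles_by_date, nodata=None):
    """Reduce each date's pixel values to their mean, skipping nodata."""
    series = []
    for day in sorted(tiles_by_date):
        valid = [
            value
            for row in tiles_by_date[day]
            for value in row
            if value is not nodata
        ]
        # A date with no valid pixels yields None rather than an error.
        series.append((day, mean(valid) if valid else None))
    return series


tiles = {
    date(2017, 2, 1): [[4, 6], [None, None]],
    date(2017, 1, 1): [[1, 3], [5, None]],
}
series = time_series(tiles)
```
]]></preformat>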
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Work</title>
      <p>At the moment, GSKY can compute products on demand, but the published list of map layers and processing
services is static. This is a current limitation of the underlying OGC protocols. Ideally, these interfaces should
be expanded so users can dynamically specify their own operations defining a product. The new WCPS standard
sets out a domain-specific language that gives users the ability to specify computations combining different
products. Studying the WCPS standard and evaluating the possibility and implications of implementing it in
GSKY is one of our next goals.</p>
      <p>Another option is to work on turning GSKY into a distributed generic workflow execution engine. This would
open the possibility for users to define and deploy their own processes, defining custom workflows
based on GSKY's underlying architecture. Having a well-defined interface and general serialization protocol for
the data would mean that processes could be defined in any programming language.</p>
      <p>A near-term goal is to extend the number of datasets and services that GSKY offers.</p>
      <sec id="sec-5-1">
        <title>Acknowledgements</title>
        <p>The authors wish to acknowledge funding from the Australian Government Department of Education, through the
National Collaborative Research Infrastructure Strategy (NCRIS) and the Education Investment Fund (EIF)
Super Science Initiatives through the National Computational Infrastructure (NCI), Research Data Storage
Infrastructure (RDSI) and Research Data Services Projects.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Akidau</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bradshaw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chambers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chernyak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fernández-Moctezuma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Reuven</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>McVeety</surname>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>The dataflow model: a practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing</article-title>
          .
          <source>Proceedings of the VLDB Endowment</source>
          <volume>8</volume>
          (
          <issue>12</issue>
          ),
          <fpage>1792</fpage>
          –
          <lpage>1803</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Baumann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dehmel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Furtado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ritsch</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Widmann</surname>
          </string-name>
          (
          <year>1998</year>
          ).
          <article-title>The multidimensional database system RasDaMan</article-title>
          .
          <source>ACM SIGMOD Record</source>
          <volume>27</volume>
          ,
          <fpage>575</fpage>
          –
          <lpage>577</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Belgun</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Grochow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Henrikson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Leihn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mason</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Raghnaill</surname>
          </string-name>
          ,
          <string-name>
            <surname>R. K.</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Simpson-Young</surname>
          </string-name>
          (
          <year>2015</year>
          ).
          <source>The Australian National Map.</source>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Caron</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Davis</surname>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>UNIDATA's THREDDS data server</article-title>
          .
          <source>22nd International Conference on Interactive Information Processing Systems for Meteorology, Oceanography, and Hydrology</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Cornillon</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , G. J., and
          <string-name>
            <surname>S. T.</surname>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>OPeNDAP: Accessing data in a distributed, heterogeneous environment</article-title>
          .
          <source>Data Science Journal</source>
          <volume>2</volume>
          ,
          <fpage>164</fpage>
          –
          <lpage>174</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Deoliveira</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>GeoServer: uniting the GeoWeb and spatial data infrastructures</article-title>
          .
          <source>Proceedings of the 10th International Conference for Spatial Data Infrastructure</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Evans</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wyborn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Pugh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Antony</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Gohar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Porter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Smillie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Trenham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ip</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Bell</surname>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>The NCI High Performance Computing and High Performance Data Platform to Support the Analysis of Petascale Environmental Data Collections</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Gorelick</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Google Earth Engine</article-title>
          .
          <source>In EGU General Assembly Conference Abstracts</source>
          , Volume
          <volume>15</volume>
          , pp.
          <fpage>11997</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Guerschman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Held</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Donohue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Renzullo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sims</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Kerblat</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Grundy</surname>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>The GEOGLAM Rangelands and Pasture Productivity Activity: Recent progress and future directions</article-title>
          .
          <source>In AGU Fall Meeting Abstracts.</source>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Kini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Emanuele</surname>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>GeoTrellis: Adding geospatial capabilities to Spark.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lymburner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. B. J.</given-names>
            <surname>Purss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Brooke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Evans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ip</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Dekker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Irons</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Minchin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mueller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Oliver</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roberts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ryan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Thankappan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Woodcock</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Wyborn</surname>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Rapid, high-resolution detection of environmental change over continental scales from satellite data: The Earth Observation Data Cube</article-title>
          .
          <source>International Journal of Digital Earth</source>
          <volume>9</volume>
          (
          <issue>1</issue>
          ),
          <fpage>106</fpage>
          –
          <lpage>111</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Marmanis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Datcu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Esch</surname>
          </string-name>
          , and
          <string-name>
            <given-names>U.</given-names>
            <surname>Stilla</surname>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Deep learning earth observation classification using ImageNet pretrained networks</article-title>
          .
          <source>IEEE Geoscience and Remote Sensing Letters</source>
          <volume>13</volume>
          (
          <issue>1</issue>
          ),
          <fpage>105</fpage>
          –
          <lpage>109</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Morrison</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Flow-Based Programming: A new approach to application development</article-title>
          .
          <source>CreateSpace.</source>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Schnase</surname>
            ,
            <given-names>J. L.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>T. J.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Mattmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Lynnes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Cinquini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Ramirez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A. F.</given-names>
            <surname>Hart</surname>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Big Data challenges in climate science: Improving the next-generation cyberinfrastructure</article-title>
          .
          <source>IEEE Geoscience and Remote Sensing Magazine</source>
          <volume>4</volume>
          (
          <issue>3</issue>
          ),
          <fpage>10</fpage>
          –
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>