<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Hajira Jabeen, Phil Archer, Simon Scerri, Aad Versteden, Ivan Ermilov, Giannis Mouchakis, Jens Lehmann, Soeren Auer. The H2020 BigDataEurope Project Consortium, c/o Fraunhofer IAIS, Sankt Augustin, Germany. info@big-data-europe.eu</article-title>
      </title-group>
      <abstract>
        <p>e BigDataEurope (BDE) project is developing exactly the kind of computing infrastructure that European stakeholders need when handling large volumes of data in a variety of formats; the results are open-source and their use is completely free. Coordinated by Fraunhofer IAIS, BDE is working directly with partners that represent the seven Societal Challenges identied by the European Commission (Health, Food, Energy, Transport, Climate, Social Sciences and Security). For each community, a pilot that makes use of BDEs technology stack to address the Big Data needs identied by these challenges is well under way. 1hp://sansa-stack.net/ which means that it is feasible to point a Compose app at a Swarm cluster and make its use possible in the same manner as if a single Docker host is being used. It is notable that the latest Docker components provide greater resemblance to Kubernetes in terms of orchestration features, and Swarm presents a beer choice in terms of shiing from a local/development environment to a cluster. e BDE Team provides baseline Docker images for Apache Hadoop, Spark, Flink and many others. Components were selected based on the requirements gathered from the seven Societal Challenges. us, the Platform makes it feasible to perform a variety of big data tasks, including message passing (Kaa, Flume), storage (Hive, Cassandra). e platform is able to handle RDF triples at scale using components like FOX, SemaGrow and 4Store; with particular emphasis on the triplication of geospatial data using GeoTriples, Sextant and Strabon. BDI has enriched the Docker platform, a high-level depiction of which is shown in Figure 1, with a layer of supporting services, helping in the setup, maintenance and monitoring of the pipeline and workows: e Init daemon allows to dene workows by monitoring the start-up status of inter-dependent Docker components. e Pipeline Service and Builder are developed to support the creation of workows. 
e Pipeline Monitor front-end demonstrates the current status of the Docker components. e Integrator UI integrates the dierent ocial Web UIs of select pipeline components under one Integrated and personalised view. Furthermore, the Swarm UI visualises the status of a swarm cluster and allows to scale and monitor the cluster services.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>THE BIG DATA INTEGRATOR PLATFORM</title>
      <p>Swarm-based networking; load balancing; service discovery; multi-host networking with integrated KV-store; fault tolerance.</p>
      <p>Docker Compose helps to create multiple containers on multiple
nodes using a single command and a single compose file. Docker
Compose V2 and Docker Swarm aim to implement full integration.</p>
      <p>For BDI platform progress updates, please refer to the dedicated
page<sup>2</sup>, try the platform out, or engage with our community<sup>3</sup>.</p>
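      <p>The idea of describing several containers in a single compose file, as mentioned above for Docker Compose, can be illustrated with a minimal sketch. It is not taken from the BDE code base: the image names, the service names and the SPARK_MASTER variable are hypothetical placeholders, and the layout assumes the Compose V2 file format with a top-level "services" key. The script builds the file content as a plain dictionary and prints it as JSON, which is a subset of YAML.</p>
      <preformat>
```python
import json


def build_compose(worker_count):
    """Return a Compose V2-style document as a plain dict.

    All image names and environment variables are hypothetical
    placeholders chosen for illustration only.
    """
    services = {
        "spark-master": {
            "image": "example/spark-master",  # hypothetical image name
            "ports": ["8080:8080"],
        }
    }
    for i in range(1, worker_count + 1):
        services[f"spark-worker-{i}"] = {
            "image": "example/spark-worker",  # hypothetical image name
            # Ask Compose to start the master before each worker.
            "depends_on": ["spark-master"],
            # Hypothetical variable telling a worker where the master is.
            "environment": ["SPARK_MASTER=spark://spark-master:7077"],
        }
    return {"version": "2", "services": services}


compose = build_compose(2)
# JSON is a subset of YAML, so this output can serve as a compose file.
print(json.dumps(compose, indent=2))
```
      </preformat>
      <p>Saved as docker-compose.json, the output could be passed to docker-compose with the -f option, so that a single command starts the master and both workers. Generating the file from a function keeps the number of workers a parameter rather than a copy-and-paste exercise.</p>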
      <p>ACKNOWLEDGMENTS</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>