<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Big Data Processing and Semantic Web Technologies for Decision Making in Hazardous Substance Dispersion Emergencies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Athanasios Davvetas</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Iraklis A. Klampanos</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Spyros Andronopoulos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giannis Mouchakis</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stasinos Konstantopoulos</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Ikonomopoulos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vangelis Karkaletsis</string-name>
          <email>vangelisg@iit.demokritos.gr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>INRASTES, NCSR 'Demokritos'</institution>
          ,
          <addr-line>Aghia Paraskevi 153 10</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Informatics and Telecommunications, NCSR 'Demokritos'</institution>
          ,
          <addr-line>Aghia Paraskevi 153 10</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Emergencies that involve the release of hazardous substances into the atmosphere can affect life and nature for several years. The timely and reliable estimation of the expected consequences for people and the environment facilitates informed decision making and timely response. Here, we demonstrate a tool that leverages Big Data and Semantic Web technologies to estimate the source location and the expected dispersion of the plume, and to link this against geo-located data about people, infrastructure, industry and other production units, and any other information relevant to potential effects on the population and the environment.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Emergencies that involve the release of hazardous substances into the atmosphere
may affect life and nature for several years. Such releases have occurred in the
past, with the Chernobyl accident of 1986 being one of the most notable examples.
In the Chernobyl case, the accident was announced to the global community with
significant delay, and was first inferred from readings and analysis by neighbouring
countries. It subsequently affected most of Europe.</p>
      <p>In such an emergency, as well as in many less notable
but sometimes equally hazardous cases, the timely and reliable estimation of
the release origin and of the expected consequences facilitates informed decision
making and timely response. The demonstration presented here focuses specifically
on the problem where nothing is known about the release itself except
for readings at monitoring stations. Under these circumstances, decision
makers need a tool that uses measurements and atmospheric conditions to estimate
the source location and the expected dispersion of the plume. Information about
the dispersion is then linked against geo-located data about people,
infrastructure, industry and other production units, and any other information relevant
to potential effects on the population and the environment.</p>
    </sec>
    <sec id="sec-2">
      <title>Demonstrated Technologies</title>
      <p>
        The demonstration is deployed on the BDE Platform, a data management and
processing environment that leverages Semantic Web technologies to handle the
integration of heterogeneous data [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. For the application demonstrated here,
we used the platform's tools for executing models on HPC infrastructures and
managing the resulting datasets and their provenance metadata, and its tools for
integrating heterogeneous data and, specifically here, having a single SPARQL
endpoint that federates multiple RDF and GIS data stores.
      </p>
      <sec id="sec-2-1">
        <title>Managing Inverse Dispersion Modelling Datasets</title>
        <p>
          Atmospheric dispersion models are computational codes that simulate the
processes of transport and diffusion of air pollutants, as well as other physical
processes that occur during dispersion, such as deposition on the ground and
transformations (chemical reactions, radioactive decay, etc.). Dispersion model
calculations are based on meteorological data. We use the NOAA HYSPLIT
atmospheric dispersion model, which is known to work well with our weather data
produced by the WRF atmospheric model [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Air concentrations are calculated
on a 2-dimensional grid of 15 km × 15 km with a temporal step of 1 hr.
        </p>
        <p>
          The core idea is to use historical weather data in order to create a set of
climatological regimes that represent the European climate. These are subsequently
employed to pre-compute the dispersion patterns for a number of known
locations of nuclear stations in Europe and to store these dispersion patterns in a
Big Data infrastructure along with provenance metadata about the conditions
and parameters used for the computation. In the event of an emergency, the
application searches for pre-computed dispersion patterns computed under
conditions and parameters that match the current weather data. These patterns
are then used to estimate the source location based on pollutant concentration
measurements and to predict future dispersion. This has been shown to be
accurate enough for immediate response in the case of an emergency, before more
accurate results can be computed several hours later by executing HYSPLIT on
actual weather data and pollutant concentration measurements [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
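        <p>The matching step described above can be illustrated with a minimal Python sketch. It is not the BDE Platform implementation: the Euclidean distance between weather feature vectors, the cosine similarity between observed and pre-computed concentrations, and all names and values (regime identifiers, candidate sites, vectors) are assumptions made purely for illustration.</p>
        <preformat><![CDATA[
```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closest_regime(current_weather, regimes):
    """Id of the regime whose representative vector is nearest to the current weather."""
    return min(regimes, key=lambda rid: euclidean(current_weather, regimes[rid]))


def rank_sources(observed, patterns, top=3):
    """Candidate sources ordered by how well their pre-computed concentrations
    at the monitoring stations match the observed readings."""
    ranked = sorted(patterns, key=lambda src: cosine(observed, patterns[src]), reverse=True)
    return ranked[:top]


# Hypothetical data: three regimes described by two weather features each.
regimes = {"R1": [1.0, 0.0], "R2": [0.0, 1.0], "R3": [-1.0, 0.0]}
# Hypothetical pre-computed concentrations at three monitoring stations,
# stored per regime and per candidate release site.
precomputed = {
    "R2": {
        "site_A": [5.0, 1.0, 0.0],
        "site_B": [0.0, 2.0, 6.0],
        "site_C": [1.0, 1.0, 1.0],
        "site_D": [0.0, 0.0, 1.0],
    }
}

regime = closest_regime([0.1, 0.9], regimes)
likely_sources = rank_sources([0.1, 1.9, 6.2], precomputed[regime])
print(regime, likely_sources)
```
]]></preformat>
        <p>In this sketch the regime is selected first, so that only the dispersion patterns pre-computed under comparable conditions are compared against the measurements; the three best-scoring candidates are then reported.</p>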
        <p>
          In the work demonstrated here, we build upon previous work on using the
HDFS and Hive components of the BDE Platform to store and access weather
data and the Cassandra component to store metadata [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. That work is
complemented with new BDE Platform components that implement pattern matching
methods for identifying similar weather patterns.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Data Integration through Federated Querying</title>
        <p>
          Semagrow is one of the core semantics-aware components of the BDE
Platform, providing client applications with a uniform perspective of heterogeneous
data stored in heterogeneous data management and processing infrastructures.
Semagrow is a federated SPARQL query processing system that transparently
selects relevant data sources, optimizes query plans, and applies the appropriate
vocabulary transformations to hide schema heterogeneity [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The Semagrow
execution engine also supports multiple query languages, including CassandraQL
and SQL, again transparently serializing the query plan to the target store's
query language and translating and joining the partial responses into the overall
SPARQL query response [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>
          In the application demonstrated here, Semagrow is used to link the cells
of the modelling grid accessed via an stSPARQL endpoint [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] with population
information from the Geonames dataset accessed via a SPARQL endpoint.
        </p>
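        <p>What this link computes can be sketched locally in Python. The sketch below is only an analogue of the spatial join that the federated query performs; it assumes axis-aligned rectangular cells in longitude/latitude, and the cell identifiers, place names, coordinates and population figures are all hypothetical.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridCell:
    cell_id: str
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon, lat):
        # Half-open bounds, so a point on a shared edge falls in one cell only.
        return self.min_lon <= lon < self.max_lon and self.min_lat <= lat < self.max_lat


@dataclass(frozen=True)
class Place:
    name: str
    lon: float
    lat: float
    population: int


def link_cells_to_places(cells, places):
    """Map each cell id to the places whose point geometry lies inside the cell."""
    return {c.cell_id: [p for p in places if c.contains(p.lon, p.lat)] for c in cells}


# Hypothetical grid cells and populated places.
cells = [
    GridCell("cell_1", 23.0, 37.0, 23.2, 37.2),
    GridCell("cell_2", 23.2, 37.0, 23.4, 37.2),
]
places = [
    Place("Town A", 23.1, 37.1, 12000),
    Place("Village B", 23.3, 37.05, 800),
    Place("Town C", 25.0, 39.0, 30000),  # outside both cells
]

linked = link_cells_to_places(cells, places)
print({cid: [p.name for p in ps] for cid, ps in linked.items()})
```
]]></preformat>
        <p>In the demonstrated application this join is not computed by the client: the cell geometries and the populated places live in different stores, and the federated query engine evaluates the spatial filter across them.</p>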
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Description of the Demonstrated Application</title>
      <p>A screencast of the application can be seen at https://vimeo.com/227245883
and the source code is available at https://github.com/big-data-europe/pilot-sc5-cycle3</p>
      <p>The user can simulate the input from the fixed radiation detection network
as well as from portable radiation detection devices. Then the user is asked to
select weather data, although accessing real-time weather data is also feasible. The application
visualizes the weather using arrows. Based on this input, the application extracts
predominant weather patterns and identifies the closest-matching pre-calculated
dispersion modelling results to immediately show dispersion results for two major
pollutants (Cs-137, I-131). The user can then choose the pollutant detected and
the method for estimating source location. Given these choices, the application
estimates and reports the three most likely pollution sources. The concentration
plume of each station is drawn on the map.</p>
      <p>Listing 1. SPARQL query template used in the demo. ?dispersCell is bound
to the dispersion model grid cell for which we retrieve more information.</p>
      <preformat>SELECT ?lat ?long ?name ?population WHERE {
  ?dispersCell &lt;http://strdf.di.uoa.gr/ontology#hasGeometry&gt; ?cell .
  ?populatedLoc &lt;http://www.opengis.net/ont/geosparql#asWKT&gt; ?point ;
    &lt;http://www.w3.org/2003/01/geo/wgs84_pos#lat&gt; ?lat ;
    &lt;http://www.w3.org/2003/01/geo/wgs84_pos#long&gt; ?long ;
    &lt;http://www.geonames.org/ontology#name&gt; ?name ;
    &lt;http://www.geonames.org/ontology#population&gt; ?population .
  FILTER &lt;http://strdf.di.uoa.gr/ontology#within&gt;( ?point, ?cell )
}</preformat>
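      <p>As an illustration of how the query template of Listing 1 might be instantiated for one grid cell, the following Python sketch substitutes the ?dispersCell variable with a cell IRI and builds a request URL that passes the query in the "query" parameter, as in the SPARQL 1.1 Protocol. The textual substitution, the function names, the cell IRI and the endpoint URL are assumptions for illustration, not taken from the application's source code.</p>
      <preformat><![CDATA[
```python
import re
from urllib.parse import urlencode

# Query template from Listing 1; ?dispersCell is the placeholder to bind.
QUERY_TEMPLATE = """SELECT ?lat ?long ?name ?population WHERE {
  ?dispersCell <http://strdf.di.uoa.gr/ontology#hasGeometry> ?cell .
  ?populatedLoc <http://www.opengis.net/ont/geosparql#asWKT> ?point ;
    <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat ;
    <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long ;
    <http://www.geonames.org/ontology#name> ?name ;
    <http://www.geonames.org/ontology#population> ?population .
  FILTER <http://strdf.di.uoa.gr/ontology#within>( ?point, ?cell )
}"""

# Characters that may not appear inside a SPARQL IRI reference (<...>).
_FORBIDDEN_IN_IRI = re.compile(r'[\x00-\x20<>"{}|\\^`]')


def bind_cell(template, cell_iri):
    """Replace the ?dispersCell variable with a concrete cell IRI."""
    if not cell_iri or _FORBIDDEN_IN_IRI.search(cell_iri):
        raise ValueError("not a valid IRI: %r" % cell_iri)
    return template.replace("?dispersCell", "<%s>" % cell_iri)


def request_url(endpoint, query):
    """URL of a GET request that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical cell IRI and endpoint address.
query = bind_cell(QUERY_TEMPLATE, "http://example.org/grid/cell/42")
url = request_url("http://localhost:8080/sparql", query)
print(query)
```
]]></preformat>
      <p>Rejecting IRIs that contain characters not allowed between angle brackets keeps a malformed cell identifier from altering the structure of the query.</p>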
      <p>The map is also enriched with numerical information about the areas affected
by the plume, and the user can filter these results by moving a slider. When
showing population, as in the example in the video, the slider sets the minimum
population for showing an affected place on the map.</p>
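      <p>The effect of the population threshold can be sketched in a few lines of Python. The record layout, the function name and the sample values are hypothetical; the sketch only shows the thresholding itself, with the additional assumption that the most populous places are listed first.</p>
      <preformat><![CDATA[
```python
def places_to_show(places, min_population):
    """Places at or above the threshold, most populous first."""
    kept = [p for p in places if p["population"] >= min_population]
    return sorted(kept, key=lambda p: p["population"], reverse=True)


# Hypothetical places affected by the plume.
affected = [
    {"name": "Town A", "population": 12000},
    {"name": "Village B", "population": 800},
    {"name": "City C", "population": 250000},
]

shown = places_to_show(affected, 10000)
print([p["name"] for p in shown])
```
]]></preformat>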
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>The work described here has received funding from the European Union's
Horizon 2020 research and innovation programme under grant agreement No 644564.
For more details, please visit https://www.big-data-europe.eu</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scerri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Versteden</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pauwels</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Charalambidis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konstantopoulos</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al.:
          <article-title>The BigDataEurope platform: supporting the variety dimension of Big Data</article-title>
          .
          <source>In: Proc. 17th Intl Conference on Web Engineering (ICWE 2017)</source>
          , Rome, Italy.
          <source>LNCS 10360</source>
          . Springer (
          <year>June 2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Charalambidis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Troumpoukis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konstantopoulos</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>SemaGrow: Optimizing federated SPARQL queries</article-title>
          .
          <source>In: Proc. 11th Intl Conference on Semantic Systems (SEMANTiCS 2015)</source>
          , Vienna, Austria (
          <year>September 2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Klampanos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pappas</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Andronopoulos</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davvetas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ikonomopoulos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karkaletsis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Identifying patterns in the weather of Europe for source term estimation</article-title>
          .
          <source>EGU General Assembly Conference Abstracts</source>
          <volume>19</volume>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Klampanos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vlachogiannis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Andronopoulos</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cofiño</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Charalambidis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lokers</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konstantopoulos</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karkaletsis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Towards supporting climate scientists and impact assessment analysts with the Big Data Europe platform</article-title>
          .
          <source>EGU General Assembly Conference Abstracts</source>
          <volume>18</volume>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Konstantopoulos</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Charalambidis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mouchakis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Troumpoukis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jakobitch</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karkaletsis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Semantic Web technologies and Big Data infrastructures: SPARQL federated querying of heterogeneous Big Data stores</article-title>
          .
          <source>In: Proc. of ISWC 2016 Demos and Posters Track</source>
          , Kobe, Japan (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Kyzirakos</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karpathiotakis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koubarakis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Strabon: A semantic geospatial DBMS</article-title>
          .
          <source>In: Proceedings ISWC 2012</source>
          , Boston, USA (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>A.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Draxler</surname>
            ,
            <given-names>R.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rolph</surname>
            ,
            <given-names>G.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stunder</surname>
            ,
            <given-names>B.J.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ngan</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>NOAA's HYSPLIT atmospheric transport and dispersion modeling system</article-title>
          .
          <source>Bulletin of the American Meteorological Society</source>
          <volume>96</volume>
          (
          <issue>12</issue>
          ) (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>