<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Online Integration of Spatial Reasoning in Complex Event Recognition</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Elias Alevizos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Georgios M. Santipantakis</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christos Doulkeridis</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Artikis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>NCSR “Demokritos”</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Piraeus</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Complex Event Recognition (CER) systems have the ability to process streams of events in real time by detecting event patterns with minimal latency. Typically, these patterns have a temporal structure, often resembling the sequential structure of regular expressions. A pattern advances to the next state by checking various conditions on the current state and possibly previous events of the stream. CER systems are very eficient in tracking all the possible paths that a pattern may follow (since these are often non-deterministic) and report when a path is complete and a complex event must be reported. However, the conditions that need to be checked may be very demanding. For example, in the maritime monitoring domain, a condition may need to check whether a vessel is close to any other vessel. Such conditions are not easily expressed directly as regular expressions. For such spatio-temporal tasks, there exist dedicated engines which can evaluate this type of conditions very eficiently. Thus, we can integrate such a spatial reasoning engine within a CER engine in order to take advantage of both worlds: the CER engine can accommodate and process complex regular expressions and delegate the evaluation of expensive spatial tasks to the dedicated spatial reasoning engine. In this paper, we present an approach towards such an integration. We show how a CER engine can take advantage of a spatial reasoning engine. We describe two diferent communication schemes between the CER engine and the spatial reasoning engine (blocking and lazy) and explore when each one should be preferred.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Finite Automata</kwd>
        <kwd>Regular Expressions</kwd>
        <kwd>Complex Event Recognition</kwd>
        <kwd>Complex Event Processing</kwd>
        <kwd>Symbolic Automata</kwd>
        <kwd>Variable-order Markov Models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Complex Event Recognition (CER) systems consume
streams of simple, input events in real time and produce
another stream of output, complex events [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. The
detection of these complex events is driven by a set of patterns,
defining relations among the input events. The patterns
determine how the input events must be ordered in time
for them to be considered a pattern match. Patterns are
typically defined through a temporal formalism. For
example, variations of regular expressions and automata are
often employed to express complex event patterns. Besides
temporal relations, a pattern may also define non-temporal
constraints on the input events, e.g., that two moving
objects are close in space. The input to a CER system thus
is a) a stream of simple, input events; b) a set of patterns
that define relations among the input events. Instances of
pattern satisfaction are called complex events. The output
of the system is another stream, composed of the detected
complex events. One main requirement for CER systems is
that they must detect complex events with very low latency,
which, in certain cases, may even be in the order of a few
milliseconds.
      </p>
      <p>
        As an example, consider the case of maritime monitoring,
which has gained considerable attention in the past yeas,
both for economic and for environmental reasons [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6">3, 4, 5, 6</xref>
        ].
Monitoring vessel activity is critical for preventing risks
and detecting suspicious or illegal behaviours, due to the
Automatic Identification System (AIS) 1. AIS is used to track
vessels at sea in real-time by allowing vessels to emit
information about their status (e.g., position and velocity) to
other vessels as well as to coastal stations. The system that
we present in this paper focuses on the domain of maritime
monitoring and its purpose is to inform potential analysts
about the behaviour of vessels at sea.
      </p>
      <p>
        In this context of real-time monitoring of moving vessels,
spatial reasoning is also important as it enables the
computation of complex relations between spatial entities [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. For
instance, the system needs to discover vessels that come close
to each other (possibly indicating a collision risk) or vessels
that sail very close to the coastline, to issue an alert. For this
purpose, in previous work, we have developed a prototype
for real-time spatial reasoning, called stLD [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which can
compute topological or proximity relations between various
types of spatial entities (points, lines, polygons) with low
latency. Moreover, stLD supports link discovery operations,
meaning that it can consume semantic representations of
data (using RDF), perform spatial reasoning and provide its
output as interlinked data.
      </p>
      <p>As described above, stLD performs spatio-temporal
reasoning on raw stream data. On the other hand, a CER engine
performs real-time reasoning at a higher abstraction level in
order to detect interesting activity patterns. Our intention
is to investigate how a CER engine could establish
communication with a spatial reasoning engine so that it can take
advantage of such an engine’s reasoning capabilities. Thus,
the CER engine will be able to focus more on performing
high-level temporal reasoning, whereas stLD can provide
support for lower-level spatial reasoning. The result will be
a high performance unified system that can perform
reasoning at various levels by taking into account and integrating
in real time any available background knowledge.</p>
      <p>The paper is structured as follows: Section 2 briefly
presents previous related work on spatio-temporal link
discovery and on CER enriched with spatio-temporal
capabilities. In Section 3 we give an overview of the CER and
stLD engines and describe how they function in isolation.
We then present how these two engines can cooperate and
communicate with each other in Section 4. In Section 5 we
present some experimental results and we conclude with
Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Link Discovery. Spatial link discovery or geospatial
interlinking has been studied in [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ]. The objective is to
discover relations of spatial nature between two datasets in
an eficient way. Typically, most approaches in the literature
rely on blocking techniques that organize data in memory
(e.g., using space tiling), so as to quickly perform a filtering
step that produces a small number of candidate pairs. Then,
in the refinement step , these candidates are examined by
evaluating the spatial relation, in order to return the true results.
More recent approaches focus on progressive interlinking to
discover as many links as possible for a given budget [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ].
In the same line of work, geospatial interlinking assisted by
supervised learning has been studied in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Notably, there
is much less work for spatio-temporal link discovery [
        <xref ref-type="bibr" rid="ref15 ref8">8, 15</xref>
        ]
which involves the movement of objects, and for real-time
link discovery where the links need to be discovered over
streaming input sources.
      </p>
      <p>
        Complex Event Recognition. Complex Event
Recognition systems have been studied extensively in the past
decades (see [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] for reviews) and they come in various
colours and flavours. The overarching theme is the
requirement for real-time detection of complex temporal patterns
on high velocity and high volume streams of events. A
significant number of CER systems employ automata as
their computational model, whereas some other resort to
logic-based solutions or tree structures. A feature that is
generally lacking though is the ability of a CER engine to
eficiently take into account any background knowledge [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
e.g., bathymetry data in the maritime domain so as to avoid
possible grounding incidents. This is not knowledge that is
directly present in the payload of the input events, but may
be static knowledge residing in a (possible remote) database.
Moreover, there is the need to combine this static
knowledge with dynamic information, e.g., (static) bathymetry
data with (dynamic) position signals to determine whether
a vessel is in danger. In addition, we may also need to
extract complex spatio-temporal information, not necessarily
related to any static datasets, e.g., to check whether a vessel
is in close proximity to any other vessel at a given point or
interval in time. One solution to these issues is to initially
pre-process the stream of input events with a dedicated
module which can perform this type of knowledge integration.
We can then generate a new stream, enriched with
background knowledge. This enriched stream can then be used
to perform CER [
        <xref ref-type="bibr" rid="ref16 ref6">6, 16</xref>
        ]. However, it is not clear whether
such an approach could work eficiently in a proper
streaming environment, with events arriving in the system in real
time. Another approach is to have any background
knowledge reside in remote databases and access the required
information on a need-to-know basis, as in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. While this
is a method which can indeed work in real time, its
limitation is that it can work only with static knowledge, i.e.,
we can extract remote information and use it in our pattern
constraints, but we cannot perform any reasoning on that
information before it reaches the CER system.
      </p>
      <p>Our approach focuses on a scheme of labour division
between CER and specialized spatio-temporal processing.
While the CER system remains responsible for handling the
general temporal structure of a pattern, it delegates
expensive spatio-temporal tasks to a dedicated engine whenever
there is such a need. This engine has the ability to
provide both static information but can also perform complex
spatio-temporal reasoning tasks.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Background</title>
      <p>In this Section, we present the necessary background
information about the CER and stLD engines that we have used.
We first briefly describe the engines when they function in
isolation in order to aid understanding. In Section 4, we will
show how they can work in conjunction.</p>
      <sec id="sec-3-1">
        <title>3.1. Complex Event Recognition</title>
        <p>
          We begin by first presenting the CER engine we have
adopted. We have chosen to use Wayeb, a Complex Event
Recognition and Forecasting engine which employs
symbolic automata as its computational model [
          <xref ref-type="bibr" rid="ref18 ref19">18, 19</xref>
          ]. Wayeb
is both eficient and expressive, while maintaining clear,
compositional semantics for the patterns expressed in its
language due to the fact that symbolic automata have nice
closure properties [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. At the same time, it is expressive
enough to support most of the common CER operators [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>
          Wayeb’s patterns are expressed as symbolic regular
expressions, i.e., they are regular expressions with the
important diference that its terminal expressions are not simple
symbols from an alphabet, but Boolean expressions [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
Wayeb’s standard operators are those of the classical regular
expressions, i.e., concatenation, disjunction and Kleene-star.
Wayeb’s language has been extended to include various
extra CER operators, e.g., that of negation.
        </p>
        <p>Formally, symbolic regular expressions are defined as
follows:
Definition 1 (Symbolic regular expression). A Wayeb
symbolic regular expression (SRE ) is recursively defined as
follows:
• If  is a Boolean expression, then  :=  is a
symbolic regular expression, with ℒ( ) = J K, i.e., the
language of  is the subset of all possible elements /
simple events for which  evaluates to TRUE;
• Disjunction / Union: If 1 and 2 are symbolic
regular expressions, then  := 1 + 2 is also a symbolic
regular expression, with ℒ() = ℒ(1) ∪ ℒ(2);
• Concatenation / Sequence: If 1 and 2 are symbolic
regular expressions, then  := 1 · 2 is also a
symbolic regular expression, with ℒ() = ℒ(1) ·
ℒ(2), where · denotes concatenation. ℒ() is then
the set of all strings constructed from concatenating
each element of ℒ(1) with each element of ℒ(2);
• Iteration / Kleene-star: If  is a symbolic regular
expression, then ′ := * is a symbolic regular
expression, with ℒ(* ) = (ℒ())* , where ℒ* = ⋃︀ ℒ
≥ 0
and ℒ is the concatenation of ℒ with itself  times.
• Negation / complement: If  is a symbolic regular
expression, then ′ := ! is a symbolic regular
expression, with ℒ(′) = (ℒ()) (  stands for
complement).</p>
        <p>Wayeb patterns are defined as symbolic regular
expressions which are subsequently compiled into symbolic
automata. Symbolic automata resemble classical automata to
a large extent. The main diference is that their transitions,
instead of being labelled with a symbol from an alphabet,
are equipped with logical formulas in the form of Boolean
expressions. For a symbolic automaton to move to another
state, it first applies the Boolean expressions of its current
state’s outgoing transitions to the element last read from
78986
78986
78986
78986</p>
        <p>78986
3
2
14
4
speed &gt; 10</p>
        <p>
          speed &gt; 10
9
3
1
11
5
2
...
...
...
Definition 2 (Symbolic finite automaton [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]). A symbolic
ifnite automaton ( SFA) is a tuple  =(, ,  , ∆ ), where
 is a finite set of states;  ∈  is the initial state;  ⊆ 
is the set of final states; ∆ ⊆  × Ψ ×  is a finite set of
transitions. Ψ is the set of Boolean expressions that can be
constructed from a set of predicates with the standard Boolean
connectors, i.e., conjunction, disjunction and negation.
        </p>
        <p>A sequence  = 12 · · · , where  are simple events, is
accepted by a SFA  if, there exists a run (i.e., a sequence
of transitions) 0 →1 1 · · · − 1 →  · · · →  such that
0 =  and  ∈  . In other words, if the SFA reaches
a final state upon reading a sequence of events, then this
sequence is accepted by the SFA. The set of sequences
accepted by  is the language of  , denoted by ℒ( ).
We can now define “complex events”. A stream  is an
infinite sequence  = 1, 2, · · · , where each  is a simple
event. Our final goal is to report the indices  at which
a complex event is detected. If 1..= · · · , − 1,  is the
prefix of  up to the index , we say that an instance of
a SRE  is detected at  if there exists a sufix .. of
1.. such that .. ∈ ℒ().</p>
        <p>
          As an example, consider the domain of maritime
monitoring. An analyst could use the Wayeb language to define
the pattern  := ( &gt; 10) · ( &gt; 10) in order to
detect speed violations in certain designated areas where
the maximum allowed speed is 10 . This pattern
detects two consecutive events where the speed exceeds the
threshold in order to avoid cases where a vessel momentarily
exceeds the threshold, possibly due to some measurement
error. This is necessarily a simplified version of a speed
violation pattern in order to demonstrate in an accessible
manner the way Wayeb works. This pattern would be
compiled to the (non-deterministic) automaton of Figure 1. Table
1 shows an example stream processed by this automaton.
For the first three input events, the automaton would remain
in its start state, state 0. After the fourth event, it would
move to state 1 and after the fifth event it would reach its
ifnal state, state 2. We would thus say that a complex event
 was detected at timestamp = 5.
In previous work, the stLD framework has been proposed
to identify possibly complex spatial relations between data
from streaming sources [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. This is achieved using a
blocking mechanism that groups data objects, so as to allow quick
ifltering of only a handful of candidate pairs that need to
be examined during refinement for inclusion in the result.
Typically, the temporal dimension has a significant role in
spatiotemporal link discovery over streaming data sources.
For instance, in the context of maritime situational
awareness, it is useful to find vessels that come close to each other
during a specific temporal interval, thus indicating a
collision risk, rather than vessels that crossed the same area
irrespective of when.
        </p>
        <p>Since the spatial constraints need to be satisfied in
conjunction with temporal constraint for discovering a
spatiotemporal relation between entities, it follows that after a
period of time some data can no longer lead to the discovery
of new relations. Such data can be considered as obsolete
and they can be eliminated by stLD, which supports a sliding
window implementation. The sliding window has a fixed
temporal duration denfied at startup with respect to the
temporal threshold and relations that need to be evaluated.
The time window “slides” on the temporal dimension, as its
starting and ending boundaries are being updated by the
latest timestamp of the data received from the stream. Any
data past the sliding window can be safely ignored, because
they can no longer satisfy the temporal constraints of the
relations. The elimination of obsolete data decreases the
memory footprint and improves the overall performance,
since fewer data need to be considered during link
discovery. Within each window, the data objects are organized in
memory using the blocking and optimization methods that
are also available for the case of archival data sources.</p>
        <p>Figure 2 illustrates the concept of blocking applied by
stLD. A dataset reporting the position of vessels is denoted
by plus symbol, and requests are illustrated by X mark. The
center of the circles are the positions reported in the data or
requests, while the range of circles are equal to the distance
threshold used in proximity relations. In the illustrated
example, the requests in cells (1,1), (1,2), (1,3) and (2,1) will
be blocked in two cells. The requests in cells (1,2), (1,3), (2,1),
(2,2) and (3,3) are satisfied by positions reported in the data
set. Please notice that the requests in cells (1,1),(1,2),(1,3) and
(2,1) are blocked in more than one cells. Similarly, entities in
cells (1,3) and (3,2) are blocked in adjacent cells as well (since
their extruded positions overlap with more than one cells).
Any other geometry type (e.g., polygons) is also supported
by stLD, however it is beyond the scope of this paper.</p>
        <p>Some relations between entities may not be discovered by
considering only their reported positions, since the
intermediate between consecutive positions are not reported in the
data set. In this case, stLD constructs a line segment (i.e., a
trajectory part) between consecutive positions, enabling the
computation of intersecting points between trajectory parts
and/or regions of interest (indicating the entry/exit points
to a region of interest). Similarly, stLD employs bilinear
interpolation when processing data retrieved from raster
sources (e.g., bathymetry, weather forecast, etc). In this case
the estimated value is computed from all the known
neighboring positions of the request in the data set. In this work,
stLD employs the bilinear interpolation to compute the sea
depth at a requested position from a given binary (GRIB)
ifle.</p>
        <p>
          Finally, stLD applies an optimization technique called
MaskLink [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] that can be applied on any type of
geometry. MaskLink computes the area of each cell that is not
occupied by any entity, the so-called “mask” of the cell that
corresponds to empty space. More precisely, the mask is
defined appropriately for a given relation under study, so
as to ensure that any entity in the mask cannot be part the
relation. Thus, any entity of the streaming data set that is
found to be within the mask of a cell can be safely ignored
from the refinement step of the link discovery process. In
this way, MaskLink can restrict the number of candidates
and improve the performance of link discovery.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. CER powered with stLD</title>
      <p>The stLD engine performs semantic spatio-temporal
reasoning on raw stream data, whereas Wayeb performs real-time
reasoning at a higher abstraction level in order to detect
interesting activity patterns. Our intention is to investigate
how Wayeb could establish communication with the stLD
engine so that it can take advantage of such an engine’s
reasoning capabilities. Thus, Wayeb will be able to focus more
on performing high-level temporal reasoning, whereas the
stLD can provide support for lower-level spatio-temporal
reasoning. The result will be a high performance unified
system that can perform reasoning at various levels (see
Figure 3).</p>
      <sec id="sec-4-1">
        <title>4.1. Complexity of Event Recognition</title>
        <p>We first have a closer look at how Wayeb works internally.
The workflow is the following. The user provides a
pattern in the form of a SRE (symbolic regular expression)
and the engine compiles this pattern into a SFA  .
Subsequently it creates a streaming version of this SFA .
This streaming SFA is then fed with a stream  of
simple events. The version of Wayeb that we will be using
works with non-deterministic automata. Thus, for each
pattern (and automaton), Wayeb maintains a set of active
runs. This set is updated after every new event arrival and
is denoted by Run(, ..), where .. is the stream up to
index . Initially, before any input event has been consumed,
the set of runs Run(, ..0) is composed of a single run,
[1, .], i.e., the automaton’s head points to the first index
in the stream and the automaton is in its start state. The
engine then reads input events one by one and updates its
set of runs after every new event. At timepoint , before
reading , it maintains the set Run(, ..− 1). After the
event , it produces Run(, ..). This is achieved by
evaluating  against every  ∈ Run(, ..− 1). Each</p>
        <p>stLD
Complex</p>
        <p>Event
Recognition
System
Complex</p>
        <p>Event
Definitions</p>
        <p>Output
. . . . . .</p>
        <p>Complex Events
. . . . . .</p>
        <p>PATTERN SEQ(lowSpeedStart a, turn + b, lowSpeedEnd c)
WHERE skip-till-next-match
AND [vesselId]
AND b[i].heading−b[i−1].heading &gt; 90
AND InProtectedArea(b[i])</p>
        <p>WITHIN 21600
run  = [1, 1] →1 [2, 2] →2 · · ·  →−1 [, ] (where
  are transitions) has to evaluate  on all the outgoing
transitions of state . If no transition is triggered, this
means that the SFA cannot move to another state and
it is thus discarded and removed from the set of runs.
It will no longer be included in Run(, ..). If only
one transition is triggered, then  is updated, becoming
 = [1, 1] →1 [2, 2] →2 · · ·  →−1 [, ] → [ + 1, +1],
with a new state +1. If  transitions are triggered and
thus  next states are to be reached, then  may be updated
as usual for one of those next states. For each of the other
 − 1 next states,  is first cloned, producing  − 1 new
runs ′, ′′, etc. Then each of these runs is updated with
the new state and register contents
The updated/new runs are added to the set of runs
Run(, ..). Accepting runs are the exception here. If
+1 ∈ . for some run , then  reports that a
complex event has been detected and is then “killed”, i.e., not
added to Run(, ..). This process is repeated for the
remaining runs of Run(, ..− 1).</p>
        <p>There are two main factors that afect the performance
of the CER engine. The first such factor is the evaluation
of the Boolean expressions on the outgoing transitions of a
run’s current state. In cases where this evaluation has low
complexity (e.g., checking whether the value of the speed
attribute is above or below some threshold is an operation
with constant, very low complexity), then moving a run
to its next state(s) also becomes an operation which incurs
low latency. The second factor is the number of active runs
at each timepoint. Since each active run must be checked
against every new incoming event, a proliferation of runs
stLD
stLD
stLD
stLD
stLD
(a) Move past state 1, ignoring (b) May reach final state. We
Proximity. have a candidate match.</p>
        <p>&gt;
start 0 GT
&gt;
3 start 0 GT
1 GT ∧ Prox 2 GT
1 GT ∧ Prox 2 GT
(c) Now send a request to stLD. (d) If reply is true, convert the
candidate match into a real
match.
stLD
may have a significant impact on the latency of processing
input events.</p>
        <p>Our goal is to address the complexity issues arising from
the first of the above factors. The reason is the following.
In the maritime domain, it is often the case that an event
pattern may be required to perform complex spatiotemporal
tasks. For example, instead of checking simply for the speed
of a vessel, it may be required to check whether a vessel
is near any other vessel with a certain given time window.
Such relations between vessels (or vessels and areas) are
generally expensive to compute and would incur a performance
penalty on the CER engine. On the other hand, they may
be eficiently computed by specialized, optimized modules,
such as the stLD. Thus, the opportunity arises for the CER
engine to delegate the evaluation of these relations to the
stLD. In order to do this, though, we need to ensure that the
CER and stLD modules are able to communicate eficiently
and that the communication cost is actually lower than the
cost of evaluating such relations locally.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Communication between CER and stLD</title>
        <p>The basic way for establishing a communication link
between the CER and stLD engines is through a blocking
scheme, as shown in Figure 4. GT stands for Greater Than
and is a simple threshold predicate, determining whether
the speed of the monitored vessel exceeds some given (not
shown here) threshold. Prox stands for Proximity and
determines whether the monitored vessel is close to any other
vessel. GT is a simple predicate and can be directly
evaluated by the CER engine, where Prox is substantially more
complex (we need to locate all possible neighbouring vessels
within certain spatial and temporal windows) and is handled
by the stLD engine. When an automaton run reaches a state,
e.g., state 1 in Figure 4, whose outgoing transitions contain
a spatial predicate, e.g., predicate Prox in Figure 4, then the
CER engine sends a relevant request to stLD and blocks,
waiting for a reply. As soon as the reply reaches back to
the CER engine, the run may continue and determine its
next state. In cases where a predicate is expensive, the CER
engine sits idle waiting for the reply, despite the fact that
could process other runs in the meantime (of the same or
even of other patterns).</p>
        <p>An alternative approach would be to employ an optimistic,
lazy strategy. Whenever a run reaches a state which needs
to communicate with the stLD, it can send a request but
avoid waiting for the reply, as shown in Figure 5. Instead,
we can make the optimistic assumption that the reply to
the request is TRUE and move the run forward according to
this assumption. For example, in Figure 5 we may assume
that Prox=TRUE and, if GT=TRUE as well, jump from state
1 to state 2. From state 1 we send an initial “request”, but
we do not block and wait for the reply. The stLD and CER
engines can then work in parallel. While the stLD engine
evaluates a relation, the CER can still keep working on its
runs. However, in this case, each run will also have a “debt”
that it will need to pay at some point in the future. This debt
corresponds to our optimistic assumption about the spatial
predicate being true. For each such request sent to stLD, a
run carries a corresponding debt, in the sense that, if the run
actually reaches its final state at some point, then it will have
to examine whether its optimistic assumption was actually
sound. For example, if the run of Figure 5 reaches state 3,
we cannot immediately determine whether this is an actual
match (complex event). We first need to check whether the
Prox request sent from state 1 was actually TRUE. For this
reason, when a run reaches a final state, it tries to repay
its debt. For each stLD request it had sent, it now sends an
equivalent “retrieve” message. If the stLD engine has had
enough time to evaluate the request, it will respond with
the actual reply (which it holds in a special data structure).
Otherwise, it sends a “NA” (not available) response. Finally,
the CER engine checks whether its optimistic assumptions
were actually valid, according to the actual responses of
stLD. If they were, Wayeb validates and commits the run
as an actual match. If any of the assumptions were wrong,
the run is discarded. If some assumptions were correct and
some replies were not yet availabe, the run repays the debt
corresponding to the correct assumptions and retains the
remaining debt. It then retries to repay this debt whenever
a new event arrives.</p>
        <p>The power of the lazy scheme lies in the fact that it
allows the CER engine to keep working on runs while the
stLD evaluates any received predicates. Thus, it ofers a
kind of parallel execution. The downside of this scheme
is that it introduces a communication cost. Compared to
the blocking scheme, each predicate evaluation requires at
least an extra “retrieve” messages (besides the “request” and
“reply” messages) and possibly multiple such messages in
cases where the stLD engine replies with “NA”. Thus, it is
possible that the lazy scheme might not always achieve a
higher performance than the blocking one. The expectation
is that it would be beneficial in cases where the cost of
evaluating the remote predicate is significantly higher than the
communication cost.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Empirical Evaluation</title>
      <p>In this Section, we present relevant experimental results. We
ifrst test the stLD and CER modules independently, while
working in standalone mode. Subsequently, we present
results showing their combined performance.</p>
      <sec id="sec-5-1">
        <title>5.1. stLD standalone</title>
        <p>
          In this work, Wayeb depends on the spatiotemporal
relations computed by stLD. We evaluate the performance of
stLD on a 8-core QEMU Virtual CPU at 2.1GHz with 5GB
dedicated memory to the Java VM. We use diferent
distance and temporal thresholds for the proximity relation
for each experiment, aiming to the computation of diferent
number of relations. For each experiment stLD computes
the sea depth at each reported position and the proximity
relations w.r.t. the distance and temporal thresholds. We
report the results for the set of experiments using the distance
( ) and temporal (  ) thresholds  ():{10, 100,
500, 1000}×   ():{60, 600, 1200} in Table 2, where
the processing time and computed number of relations are
reported in columns (per temporal threshold) for each
distance threshold (in rows) evaluated. For example, the top
left cells of the table, report that 122,518 proximity relations
can be computed over the complete data set in 563,551 msec,
using a distance threshold of 10 meters and temporal
threshold of 60 seconds. The complete data covers a temporal
interval of 4392 hours (183 days) and it is processed in 562
seconds (as an average of processing time in the results).
A rough estimation of the processing time per relation is
0.128 msec per relation, which is computed from the
average processing time over the average number of relations
computed.
An important feature of Wayeb is that it can detect patterns
with very low latency. It can thus scale gracefully with
increased loads. We first tested Wayeb’s scalability in the
maritime domain as a standalone component. Specifically, we
used a real–world dataset composed of a set of trajectories
from ships sailing at sea, emitting AIS (Automatic
Identification System) messages that relay information about their
position, heading, speed, etc. The dataset that we used is
publicly available, contains AIS kinematic messages from
vessels sailing in the Atlantic Ocean around the port of Brest,
France, and spans a period from 1 October 2015 to 31 March
2016. The total number of AIS messages is 18,648,556 from
approximately 5,000 vessels [
          <xref ref-type="bibr" rid="ref22 ref23">22, 23</xref>
          ].
        </p>
        <p>As a test pattern, we used one that detects speed
violations, i.e., consecutive messages where a vessel’s speed
exceeds some given threshold, as in Figure 1. This
pattern is applied on a per-vessel basis, i.e., each vessel has
its own “copy” of the pattern. We started by running this
single pattern on the dataset. In order to increase the load
of the engine, we then started increasing the number of
patterns (by copying the original pattern multiple times).
Note that each pattern (or copy thereof) is applied on all
vessels. The number of running automata is thus equal to
the number of pattern copies multiplied by the number of
vessels. The throughput starts from approximately 1.000.000
events/second for a single pattern and decreases as the
number of patterns increases. In Figure 6 we show results for
the scalability test. Instead of showing directly throughput
numbers, we show the degree to which our system is
better than the real-time requirements of the problem. The
x axis corresponds to the number of patterns. The y axis
corresponds to the ratio of execution over real time. The
execution time is the total time that the engine required
to process the whole dataset. By real time, we refer to the
total time that elapses in the real world for the dataset to be
produced, i.e., the diference between the timestamp of the
last event in the dataset and the timestamp of the first event.
A value of 1.0 for this ratio would mean that the engine
can process events at the rate that these are produced. Any
value above 1.0 would mean that the engine lags behind the
input event and cannot process them in time. Any value
below 1.0 would mean that the engine can cope with the
event rate. Note that Wayeb, in all cases, can process the
dataset at a rate that is orders of magnitude greater than
the input event rate. As expected, the ratio increases as
the number of patterns increases as well. However, it is
always below 1%, even with 100 patterns. This indicates
that Wayeb can handle real-world maritime patterns with
exceptional eficiency, even for increased workloads.</p>
        <p>Subsequently, we tested Wayeb against another type of
workload variation: the input event rate. We applied an
interpolation scheme on the initial dataset to produce
derivative datasets where the interval between any two
consecutive position signals of a vessel is fixed. We created datasets
where the interpolation interval is 1, 5, 10, 15, 30 seconds,
corresponding to a frequency of 60, 12, 6, 4, 2 events per
minute for each vessel. The relevant results are shown in
We then compared the performance of the lazy versus the
blocking scheme for various values of the communication
and evaluation cost/delay. We used a pattern detecting a
sequence of AIS messages where a vessel has a high speed,
and it is close to another vessel:
 :=  ·  ·  WHERE</p>
        <p>GT (x , speed , 5.0) AND
(GT (y , speed , 5.0) ∧ ProximityRemote (y )) AND
GT (z , speed , 5.0)</p>
        <p>PARTITION BY vesselId
The pattern is a sequence of three events. The constraint
for the first and last events is the same: the speed must be
greater than 5 knots. For the middle event, there is the extra
constraint that the vessel must be in close proximity to at
least another vessel. The sufix Remote indicates that the
proximity predicate is evaluated by the stLD engine.</p>
        <p>For a systematic evaluation, we simulated the evaluation
(of stLD) and communication delays. We also used a
parameter to simulate the probability of stLD evaluating a
predicate as TRUE. Figure 8 shows throughput as a
function of communication and evaluation delay. It also shows
the maximum possible throughput with blocking, assuming
that the latency of Wayeb is 0 (displayed as max). Figure
9 shows the throughput increase of lazy communication
over blocking. We can see that lazy is almost always better,
except for very low (evaluation and communication) delay
values. Figure 10 shows again throughput as a function of
communication and evaluation delay. However, this time
we also run experiments for diferent values of satisfaction
probability of the proximity predicate. The lower the value
of this satisfaction probability the fewer the time that the
stLD will reply with TRUE to a request from the CEP engine.</p>
        <p>In Figure 8, the drop in throughput from the max curve to
the blocking curve for a given combination of delay values
corresponds to the part of throughput that we lose due to
the delay of Wayeb. Note that throughput is estimated as:
  
  +  +</p>
        <p>For example, for the x point at 1k-2k, the max throughput is
350.000 events/sec and the blocking throughput is 150.000
events/sec. Wayeb thus costs us 200.000 events/sec. Notice
that the max and blocking curves start to coincide after a
certain point. This means that the total delay (the sum of
communication, stLD and Wayeb delays) is dominated by
com-eval delays after that point and the delay of Wayeb is
minimal compared to the com-eval delays. The
denominators in the definition of throughput for both cases tend to
become equal, which can happen only if WayebDelay is
very small compared to the other two components
(remember that WayebDelay = 0 for the max case). Finally, it is
interesting to note that lazy does not seem to be significantly
afected by the evaluation delay.</p>
        <p>We can also see in Figure 9 that throughput increase of
lazy over blocking may reach values of 200%. As initially
suspected, the lazy scheme is most beneficial when the
evaluation delay is large relative to the communication delay.
Since the lazy scheme sufers from a higher communication
cost, we need the evaluation cost to be relatively large.
Otherwise, any benefits gained from our optimistic lazy scheme
would be ofset by the communication cost.</p>
        <p>Finally in Figure 10 we see that the lazy scheme is not as
much afected by the satisfaction probability as the blocking
scheme. We also see that, as we decrease the satisfaction
probability, the blocking scheme turns out to be better than
the lazy for a wider range of delay values. The reason is
that a lower satisfaction probability means that more runs
are killed because of more FALSE replies from stLD. As a
result, the lazy scheme advances more runs that it should
not have. Therefore, we pay a higher overhead cost.
We presented an approach for integrating spatio-temporal
background knowledge within a CER system. We
demonstrated how a CER engine can take advantage of a
dedicated spatio-temporal module in order to leverage its
capabilities and maintain high throughput values even in the
presence of computationally demanding spatio-temporal
constraints. We described two diferent communication
schemes between the CER engine and the spatio-temporal
module: blocking and lazy. Our results showed that the lazy
scheme is preferable in most cases. However, there are a few
cases where the cost of maintaining an increased number
of pattern runs is greater than any benefits gained from
the lazy scheme and the blocking one turns out to be more
advantageous.</p>
        <p>As future work, we intend to explore more
communication schemes. For example, currently, we wait until the
automaton has reached a final state and then start sending
retrieve messages to stLD. But we could do this in earlier
states and thus limit the number of irrelevant runs.
Alternatively, stLD could assume a more proactive role and send
notifications to CER. Then CER could consume replies
“immediately” and try to pay debts. Communication schemes
which attempt to prefetch any relevant information could be
of interest as well. We also intend to automate the process of
selecting the proper communication scheme. In each state,
an automated system should be able to decide whether to,
a) prefetch any relevant information; b) block and evaluate
predicates; c) postpone, assuming the predicates are TRUE;
d) or even postpone, assuming the predicate is FALSE, which
implies that in cases where the predicate is evaluated finally
as TRUE, we could possibly miss some complex events.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgment</title>
      <p>This work was supported by the VesselAI project, under EU
H2020 grant agreement No 957237.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.</given-names>
            <surname>Giatrakos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Alevizos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Artikis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Deligiannakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. N.</given-names>
            <surname>Garofalakis</surname>
          </string-name>
          ,
          <article-title>Complex event recognition in the big data era: a survey</article-title>
          ,
          <source>VLDB J</source>
          .
          <volume>29</volume>
          (
          <year>2020</year>
          )
          <fpage>313</fpage>
          -
          <lpage>352</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Cugola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Margara</surname>
          </string-name>
          ,
          <article-title>Processing flows of information: From data stream to complex event processing</article-title>
          ,
          <source>ACM Comput. Surv</source>
          .
          <volume>44</volume>
          (
          <year>2012</year>
          )
          <volume>15</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          :
          <fpage>62</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B.</given-names>
            <surname>Idiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>The automatic identification system of maritime accident risk using rule-based reasoning</article-title>
          , in: SoSE, IEEE,
          <year>2012</year>
          , pp.
          <fpage>125</fpage>
          -
          <lpage>130</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Terroso-Saenz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Valdés-Vela</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. F.</given-names>
            <surname>SkarmetaGómez</surname>
          </string-name>
          ,
          <article-title>A complex event processing approach to detect abnormal behaviours in the marine environment</article-title>
          ,
          <source>Inf. Syst. Frontiers</source>
          <volume>18</volume>
          (
          <year>2016</year>
          )
          <fpage>765</fpage>
          -
          <lpage>780</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Snidaro</surname>
          </string-name>
          , I. Visentini,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bryan</surname>
          </string-name>
          ,
          <article-title>Fusing uncertain knowledge and evidence for maritime situational awareness via markov logic networks</article-title>
          ,
          <source>Inf. Fusion</source>
          <volume>21</volume>
          (
          <year>2015</year>
          )
          <fpage>159</fpage>
          -
          <lpage>172</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>K.</given-names>
            <surname>Patroumpas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Alevizos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Artikis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vodas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Pelekis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Theodoridis</surname>
          </string-name>
          ,
          <article-title>Online event recognition from moving vessel trajectories</article-title>
          ,
          <source>GeoInformatica</source>
          <volume>21</volume>
          (
          <year>2017</year>
          )
          <fpage>389</fpage>
          -
          <lpage>427</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Santipantakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vlachou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Doulkeridis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Artikis</surname>
          </string-name>
          , I. Kontopoulos,
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Vouros</surname>
          </string-name>
          ,
          <article-title>A stream reasoning system for maritime monitoring</article-title>
          ,
          <source>in: TIME</source>
          , volume
          <volume>120</volume>
          of LIPIcs,
          <year>2018</year>
          , pp.
          <volume>20</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          :
          <fpage>17</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Santipantakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Glenis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Doulkeridis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vlachou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Vouros</surname>
          </string-name>
          ,
          <article-title>stLD: towards a spatio-temporal link discovery framework</article-title>
          , in: SBD@SIGMOD, ACM,
          <year>2019</year>
          , pp.
          <volume>4</volume>
          :
          <fpage>1</fpage>
          -
          <issue>4</issue>
          :
          <fpage>6</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Sherif</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Dreßler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Smeros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Ngomo</surname>
          </string-name>
          ,
          <article-title>Radon - rapid discovery of topological relations</article-title>
          , in: AAAI, AAAI Press,
          <year>2017</year>
          , pp.
          <fpage>175</fpage>
          -
          <lpage>181</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Ngomo</surname>
          </string-name>
          , ORCHID - reduction
          <article-title>-ratio-optimal computation of geo-spatial distances for link discovery</article-title>
          ,
          <source>in: ISWC</source>
          , volume
          <volume>8218</volume>
          , Springer,
          <year>2013</year>
          , pp.
          <fpage>395</fpage>
          -
          <lpage>410</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A. F.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Sherif</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Ngomo</surname>
          </string-name>
          ,
          <article-title>On the efect of geometries simplification on geo-spatial link discovery</article-title>
          ,
          <source>in: SEMANTiCS</source>
          , volume
          <volume>137</volume>
          of Procedia Computer Science, Elsevier,
          <year>2018</year>
          , pp.
          <fpage>139</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Papadakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Mandilaras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mamoulis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          , Progressive, holistic geospatial interlinking,
          <source>in: WWW, ACM / IW3C2</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>833</fpage>
          -
          <lpage>844</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Papadakis</surname>
          </string-name>
          , G. Mandilaras,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mamoulis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          ,
          <article-title>Static and dynamic progressive geospatial interlinking</article-title>
          ,
          <source>ACM Trans. Spatial Algorithms Syst</source>
          .
          <volume>8</volume>
          (
          <issue>2022</issue>
          )
          <fpage>1</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>M. D. Siampou</surname>
            , G. Papadakis,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Mamoulis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Koubarakis</surname>
          </string-name>
          ,
          <article-title>Supervised scheduling for geospatial interlinking</article-title>
          , in: SIGSPATIAL, ACM,
          <year>2023</year>
          , pp.
          <volume>42</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>42</lpage>
          :
          <fpage>12</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P.</given-names>
            <surname>Smeros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          ,
          <article-title>Discovering spatial and temporal links among RDF data</article-title>
          ,
          <source>in: LDOW@WWW</source>
          , volume
          <volume>1593</volume>
          <source>of CEUR Workshop Proceedings</source>
          , CEURWS.org,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Pitsikalis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Artikis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dreo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Camossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jousselme</surname>
          </string-name>
          ,
          <article-title>Composite event recognition for maritime monitoring</article-title>
          , in: DEBS, ACM,
          <year>2019</year>
          , pp.
          <fpage>163</fpage>
          -
          <lpage>174</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhao</surname>
          </string-name>
          , H. van der Aa, T. T. Nguyen,
          <string-name>
            <given-names>Q. V. H.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          <article-title>Weidlich, EIRES: eficient integration of remote data in event stream processing</article-title>
          , in: SIGMOD Conference, ACM,
          <year>2021</year>
          , pp.
          <fpage>2128</fpage>
          -
          <lpage>2141</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>E.</given-names>
            <surname>Alevizos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Artikis</surname>
          </string-name>
          , G. Paliouras,
          <article-title>Complex event forecasting with prediction sufix trees</article-title>
          ,
          <source>VLDB J</source>
          .
          <volume>31</volume>
          (
          <year>2022</year>
          )
          <fpage>157</fpage>
          -
          <lpage>180</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>E.</given-names>
            <surname>Alevizos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Artikis</surname>
          </string-name>
          , G. Paliouras,
          <article-title>Wayeb: a tool for complex event forecasting</article-title>
          ,
          <source>in: LPAR</source>
          , volume
          <volume>57</volume>
          of EPiC Series in Computing, EasyChair,
          <year>2018</year>
          , pp.
          <fpage>26</fpage>
          -
          <lpage>35</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>L. D'Antoni</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Veanes</surname>
          </string-name>
          ,
          <article-title>The power of symbolic automata and transducers</article-title>
          ,
          <source>in: CAV (1)</source>
          , volume
          <volume>10426</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2017</year>
          , pp.
          <fpage>47</fpage>
          -
          <lpage>67</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Santipantakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Doulkeridis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Vouros</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . Vlachou,
          <article-title>MaskLink: Eficient link discovery for spatial relations via masking areas</article-title>
          , CoRR abs/
          <year>1803</year>
          .01135 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>C.</given-names>
            <surname>Ray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dreo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Camossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jousselme</surname>
          </string-name>
          ,
          <article-title>Heterogeneous Integrated Dataset for Maritime Intelligence</article-title>
          , Surveillance, and Reconnaissance,
          <volume>10</volume>
          .5281/zenodo.1167595,
          <year>2018</year>
          . URL: https://doi.org/10.5281/ zenodo.1167595. doi:
          <volume>10</volume>
          .5281/zenodo.1167595.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>K.</given-names>
            <surname>Bereta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chatzikokolakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zissis</surname>
          </string-name>
          ,
          <article-title>Maritime reporting systems</article-title>
          , in: Guide to Maritime Informatics, Springer,
          <year>2021</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>