<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Distributed Online Learning Approach for Pattern Prediction over Movement Event Streams with Apache Flink</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ehab Qadah</string-name>
          <email>Ehab.Qadah@iais.fraunhofer.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elias Alevizos</string-name>
          <email>alevizos.elias@iit.demokritos.gr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Mock</string-name>
          <email>Michael.Mock@iais.fraunhofer.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Georg Fuchs</string-name>
          <email>Georg.Fuchs@iais.fraunhofer.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Fraunhofer IAIS</institution>
          ,
          <addr-line>Sankt Augustin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>NCSR "Demokritos"</institution>
          ,
          <addr-line>Athens</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <fpage>109</fpage>
      <lpage>116</lpage>
      <abstract>
        <p>In this paper, we present a distributed online prediction system for user-defined patterns over multiple massive streams of movement events, built using the general-purpose stream processing framework Apache Flink. The proposed approach is based on combining probabilistic event pattern prediction models on multiple predictor nodes with a distributed online learning protocol in order to continuously learn the parameters of a global prediction model and share them among the predictors in a communication-efficient way. Our approach enables collaborative learning between the predictors (i.e., they "learn from each other"), thus accelerating the learning rate while requiring less data for each predictor. The underlying model provides online predictions about when a pattern (i.e., a regular expression over the event types) is expected to be completed within each event stream. We describe the distributed architecture of the proposed system, its implementation in Flink, and present experimental results over real-world event streams related to trajectories of moving vessels.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        In recent years, technological advances have led to a growing
availability of massive amounts of continuous streaming data
(i.e., data streams observing events) in many application domains
such as social networks [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], Internet of Things (IoT) [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], and
maritime surveillance [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ]. The ability to detect and predict the
full matches of a pattern of interest (e.g., a certain sequence of
events), defined by a domain expert, is typically important for
operational decision making tasks in the respective domains.
      </p>
      <p>
        An event stream is an unbounded collection of time-ordered
data observations, each in the form of a tuple of attributes that is
composed of a value from a finite set of event types along with other
categorical and numerical attributes. In this work, we deal with
movement event streams. For instance, in the context of
maritime surveillance the event stream of a moving vessel consists
of spatio-temporal and kinematic information along with the
vessel’s identification and its trajectory related events, based
on the automatic identification system (AIS) [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] messages that
are continuously sent by the vessel. Therefore, leveraging event
pattern prediction over real-time streams of moving vessels is
useful to alert maritime operation managers about suspicious
activities (e.g., fast sailing vessels near ports, or illegal fishing)
before they happen. However, processing real-time streaming
      </p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK AND BACKGROUND</title>
    </sec>
    <sec id="sec-3">
      <title>Related work</title>
      <p>
        Pattern prediction over event streams. The task of forecasting
over time-evolving streams of data can be formulated in various
ways and with varying assumptions. One common way to
formalize this task is to assume that the stream is a time-series of
numerical values, and the goal is to forecast at each time point
n the values at some future points n + 1, n + 2, etc., (or even
the output of some function of future values). This is the task of
time-series forecasting [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. Another way to formalize this task is
to view streams as sequences of events, i.e., tuples with multiple,
possibly categorical, attributes, like event type, timestamp, etc.,
and the goal is to predict future events or patterns of events. In
this paper, we focus on this latter definition of forecasting (event
pattern forecasting).
      </p>
      <p>
        A substantial body of work on event forecasting comes from
the field of temporal pattern mining where events are defined as
2-tuples of the form (EventType, Timestamp). The ultimate goal
is to extract patterns of events in the form either of association
rules [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] or frequent episode rules [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. These methods have been
extended in order to be able to learn not only rules for detecting
event patterns but also rules for predicting events. For example,
in [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ], a variant of association rule mining is presented, where the goal is
to extract sets of event types that frequently lead to a rare, target
event within a temporal window.
1http://www.datacron-project.eu/
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], a probabilistic model is presented for calculating the
probability of the immediately next event in the stream. This
is achieved by using standard frequent episode discovery
algorithms and combining them with Hidden Markov Models and
mixture models. The framework of episode rules is employed
in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] as well. The output of the proposed algorithms is a set
of predictive rules whose antecedent is minimal (in number of
events) and temporally distant from the consequent. In [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ] a
set of algorithms is proposed that target batch online mining of
sequential patterns, without maintaining exact frequency counts.
As the stream is consumed, the learned patterns can be used to
test whether a prefix matches the last events seen in the stream,
indicating a possibility of occurrence for events that belong to
the suffix of the rule.
      </p>
      <p>
        Event forecasting has also attracted some attention from the
field of Complex Event Processing (see [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] for a review of
Complex Event Processing). One such early approach is presented
in [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. Complex event patterns are converted to automata, and
subsequently, Markov chains are used in order to estimate when
a pattern is expected to be fully matched. A similar approach
is presented in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], where again automata and Markov chains
are employed in order to provide (future) time intervals during
which a match is expected with a probability above a confidence
threshold.
      </p>
      <p>
        Distributed Online Learning. In recent years, the problem of
distributed online learning has received increased attention and
has been studied in [
        <xref ref-type="bibr" rid="ref16 ref18 ref33 ref34 ref8">8, 16, 18, 33, 34</xref>
        ]. A distributed online
minibatch prediction approach over multiple data streams has been
proposed in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. This approach is based on a static
synchronization method. The learners periodically communicate their local
models with a central coordinator unit after consuming a fixed
number of input samples/events (i.e., batch size b), in order to
create a global model and share it between all learners. This work
has been extended in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] by introducing a dynamic
synchronization scheme that reduces the required communication overhead.
It can do so by making the local learners communicate their
models only if they diverge from a reference model. In this work, we
employ this protocol with event pattern prediction models over
multiple event streams.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Technological Background</title>
      <p>
        In the last years, many systems for large-scale and distributed
stream processing have been proposed, including Spark
Streaming [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], Apache Storm [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and Apache Flink [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. These
frameworks can ingest and process real-time data streams, published
from different distributed message queuing platforms, such as
Apache Kafka [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] or Amazon Kinesis [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In this work, we
implemented our system in Flink and Kafka (see Section 4).
      </p>
      <p>
        In the datAcron project, the Flink stream processing engine
has been chosen as the primary platform for supporting the
streaming operations, based on an internal comparative evaluation of
several streaming platforms. Hence, we used it to implement our
system. A predecessor distributed online learning framework has
already been implemented in the FERARI project [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] based on
Apache Storm.
      </p>
      <p>
        Apache Flink. Apache Flink is an open source project that
provides a large-scale, distributed, and stateful stream processing
platform [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Flink is one of the most recent and pioneering Big
Data processing frameworks. It provides processing models for
both streaming and batch data, where the batch processing model
is treated as a special case of the streaming one (i.e., finite stream).
Flink’s software stack includes the DataStream and DataSet APIs
for processing infinite and finite data, respectively. These two
core APIs are built on top of Flink’s core dataflow engine and
provide operations on data streams or sets such as mapping,
filtering, grouping, etc.
      </p>
      <p>
        The two main data abstractions of Flink are DataStream and
DataSet; they represent read-only collections of data elements.
The list of elements is bounded (i.e., finite) in DataSet, while it is
unbounded (i.e., infinite) in the case of DataStream. Flink’s core
is a distributed streaming dataflow engine. Each Flink program
is represented by a data-flow graph (i.e., directed acyclic graph
DAG) that gets executed by Flink’s dataflow engine [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The data
flow graphs are composed of stateful operators and intermediate
data stream partitions. The execution of each operator is handled
by multiple parallel instances whose number is determined by
the parallelism level. Each parallel operator instance is executed
in an independent task slot on a machine within a cluster of
computers [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        Apache Kafka. Apache Kafka is a scalable, fault-tolerant, and
distributed streaming framework/messaging system [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. It
allows applications to publish and subscribe to arbitrary data streams, which
are managed in different categories (i.e., topics) and partitioned
in the Kafka cluster. The Kafka Producer API provides the
ability to publish a stream of messages to a topic. These messages
can then be consumed by applications, using the Consumer API
that allows them to read the published data stream in the Kafka
cluster. In addition, the streams of messages are distributed and
load balanced between the multiple receivers within the same
consumer group for the sake of scalability.
      </p>
      <p>Definition 3.1. Each event is defined as a tuple of attributes
ei = (id, type, τ , a1, a2, ..., an ), where type is the event type
attribute that takes a value from a finite set of event types/symbols
Σ, τ represents the time when the event tuple was created, and
a1, a2, ..., an are spatial or other contextual features (e.g., speed);
these features vary from one application domain to
another. The attribute id is a unique identifier that connects the
event tuple to an associated domain object.</p>
      <p>Definition 3.2. A stream s = ⟨e1, e2, ..., et , ...⟩ is a time-ordered
sequence of events.</p>
      <p>
        A user-defined pattern P is given in the form of a regular
expression (i.e., using operators for sequence, disjunction, and
iteration) over Σ (i.e., event types) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. More formally, a pattern
is given through the following grammar:
      </p>
      <p>Definition 3.3. P := E | P1 ; P2 | P1 ∨ P2 | P1∗, where E ∈ Σ is
a constant event type, ; stands for sequence, ∨ for disjunction,
and ∗ for the Kleene star. The pattern P := E is matched by reading
an event ei if ei .type = E. The other cases are matched as in
standard automata theory.</p>
      <p>The problem at hand may then be stated as follows: given
a stream s of low-level events and a pattern P, the goal is to
estimate at each new event arrival the number of future events
that we will need to wait for until the pattern is satisfied (and
therefore a full match is detected).</p>
      <p>
        3.1.2 Proposed approach. As a first step, event patterns are
converted to deterministic finite automata (DFA) through
standard conversion algorithms [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. As an example, see Figure 1a for
the DFA of the simple sequential pattern P = a; d; c and an
alphabet Σ = {a, b, c, d } (note that the DFA has no dead states since
we need to handle streams and not strings). The next step is to
derive a Markov chain that will be able to provide a probabilistic
description of the DFA’s run-time behavior.
      </p>
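For a purely sequential pattern such as P = a; d; c, the DFA for Σ∗; P is the classic string-matching automaton, whose state records the length of the longest pattern prefix matched by the current stream suffix. The following is a minimal Python sketch for illustration only (not the authors' implementation); disjunction and iteration would require the full regex-to-DFA conversion of [15]:

```python
def build_dfa(pattern, alphabet):
    """DFA for detecting the sequential pattern anywhere in a stream
    (i.e., for Sigma*;P): state q = length of the longest pattern
    prefix matched by the current stream suffix. No dead states:
    every state has an outgoing transition for every symbol."""
    m = len(pattern)
    delta = [dict() for _ in range(m + 1)]
    for state in range(m + 1):
        matched = pattern[:min(state, m)]
        for sym in alphabet:
            seen = matched + sym  # text just consumed
            k = min(m, len(seen))
            # longest pattern prefix that is a suffix of `seen`
            while k and pattern[:k] != seen[len(seen) - k:]:
                k -= 1
            delta[state][sym] = k
    return delta

# DFA for P = a;d;c over the alphabet {a,b,c,d}; state 3 is final
dfa = build_dfa("adc", "abcd")
```

Because there are no dead states, a full match can be detected (and re-detected) at any point while the stream keeps flowing.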
      <p>
        Towards this goal, we use Pattern Markov Chains, as was
proposed in [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. Under the assumption that the input stream is
generated by an m-order Markov source, and after performing a
transformation step on the initial DFA to handle the mth order
(see [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] for more details), it can be shown that there is a direct
mapping of the states of the DFA to states of a Markov chain and
the transitions of the DFA to transitions of the Markov chain.
The transition probabilities are then conditional probabilities on
the event types.
      </p>
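For illustration, the mapping is easiest to see in the degenerate order-0 case (an i.i.d. event source), where each DFA transition simply carries the occurrence probability of its symbol. The sketch below is illustrative Python, not the authors' code; the true order-m construction would first expand the DFA as in [27] and use conditional probabilities instead:

```python
def pmc_from_dfa(delta, probs, final):
    """Pattern Markov Chain transition matrix for an i.i.d. (order-0)
    event source: M[q][q2] sums P(sym) over the symbols that drive
    the DFA from state q to q2; the final state is made absorbing."""
    n = len(delta)
    M = [[0.0] * n for _ in range(n)]
    for q in range(n):
        if q == final:
            M[q][q] = 1.0  # absorbing: self-loop with probability 1
            continue
        for sym, p in probs.items():
            M[q][delta[q][sym]] += p
    return M

# transition function of the DFA for P = a;d;c over {a,b,c,d}
# (state = length of the currently matched pattern prefix)
delta = [
    {"a": 1, "b": 0, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 0, "d": 2},
    {"a": 1, "b": 0, "c": 3, "d": 0},
    {"a": 1, "b": 0, "c": 0, "d": 0},  # ignored: state 3 is final
]
M = pmc_from_dfa(delta, {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25},
                 final=3)
```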
      <p>We call such a derived Markov chain a Pattern Markov Chain
(PMC) of order m and denote it by PMC_P^m, where P is the initial
pattern and m the assumed order. As an example, see Figure 1b,
which depicts the PMC of order 1 for the generated DFA of
Figure 1a.</p>
      <p>After constructing a PMC, we can use it to calculate the
so-called waiting-time distributions. Given a specific state of the
PMC, a waiting-time distribution gives us the probability of
reaching a set of absorbing states in n transitions from now (absorbing
states are states with self-loops of probability equal to 1.0). By
mapping the final states of the initial DFA to absorbing states
of the PMC (see again Figure 1), we can calculate the
probability of reaching a final state, or, in other words, of
detecting a full match of the original regular expression in n events
from now.</p>
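Given a PMC transition matrix with the final state made absorbing, the waiting-time distribution follows from iterated vector-matrix multiplication. A minimal Python sketch; the matrix below is the uniform-probability, order-0 PMC of the P = a; d; c running example and is assumed here purely for illustration:

```python
def waiting_time_dist(M, start, final, horizon):
    """P(the absorbing final state is first reached in exactly n
    transitions), for n = 1..horizon: since `final` is absorbing,
    it is the per-step increase of the cumulative hitting
    probability."""
    n = len(M)
    v = [0.0] * n
    v[start] = 1.0
    dist, hit = [], 0.0
    for _ in range(horizon):
        v = [sum(v[i] * M[i][j] for i in range(n)) for j in range(n)]
        dist.append(v[final] - hit)
        hit = v[final]
    return dist

# order-0 PMC of the P = a;d;c example with uniform event-type
# probabilities (an assumption for illustration)
M = [[0.75, 0.25, 0.0, 0.0],
     [0.5, 0.25, 0.25, 0.0],
     [0.5, 0.25, 0.0, 0.25],
     [0.0, 0.0, 0.0, 1.0]]
wt = waiting_time_dist(M, start=0, final=3, horizon=600)
```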
      <p>In order to estimate the final forecasts, another step is required,
since our aim is not to provide a single future point with the
highest probability but an interval. Predictions are given in the form
of intervals, as I = (start, end). The meaning of such an interval
is that the DFA is expected to reach a final state sometime in the
future between the start and end with probability at least some
constant threshold θf c (provided by the user). These intervals are
estimated by a single-pass algorithm that scans a waiting-time
distribution and finds the smallest (in terms of length) interval
whose probability exceeds this threshold θf c . For example,
Figure 2a shows the waiting-time distributions for the non-final
states of the DFA in Figure 1, and the computed prediction
intervals are depicted in Figure 2b.</p>
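The single-pass interval search can be sketched as a two-pointer scan over the waiting-time distribution. This Python sketch is illustrative (the function name is hypothetical), with `wt[n-1]` assumed to hold the probability of completion in exactly n events:

```python
def prediction_interval(wt, theta):
    """Smallest future interval I = (start, end) (1-indexed, in
    number of events from now) whose waiting-time probability mass
    reaches the confidence threshold theta; single pass with two
    pointers over the distribution."""
    best = None
    s, acc = 0, 0.0
    for e in range(len(wt)):
        acc += wt[e]
        # shrink from the left while the window still reaches theta
        while acc - wt[s] >= theta:
            acc -= wt[s]
            s += 1
        if acc >= theta and (best is None or e - s < best[1] - best[0]):
            best = (s + 1, e + 1)
    return best
```

Because the probabilities are non-negative, the window never has to grow again after shrinking, so the scan stays linear in the length of the distribution.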
      <p>The method described above assumes that we know the
(possibly conditional) occurrence probabilities of the various event
types appearing in a stream (as would be the case with
synthetically generated streams). However, this is not always the case in
real-world situations. Therefore, it is crucial for a system
implementing this method to have the capability to learn the values of
the PMC’s transition matrix. One way to do this is to use some
part of the stream to obtain the maximum-likelihood estimators
for the transition probabilities [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. If Π is the transition matrix of
a Markov chain with a set of states Q, πi,j the transition
probability from state i to state j, and ni,j the number of observed
transitions from state i to state j, then the maximum-likelihood
estimator for πi,j is given by:
      </p>
      <p>π̂i,j = ni,j / ( Σk∈Q ni,k ) = ni,j / ni</p>
      <p>[Figure 1: (a) the DFA for Σ∗; P with P = a; d; c, (b) the corresponding order-1 PMC (PMC_P^1). Figure 2: (a) waiting-time distributions, (b) prediction intervals.]</p>
      <p>Executing this learning step on a single node might require a
vast amount of time until we arrive at a sufficiently good model.
In this paper, we present a distributed method for learning the
transition probability matrix.</p>
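The estimator above amounts to row-normalized transition counts; a minimal Python sketch (illustrative, with a hypothetical function name):

```python
def mle_transition_matrix(transitions, n_states):
    """Maximum-likelihood estimate of the transition matrix from a
    list of observed (from_state, to_state) pairs:
    pi_hat[i][j] = n_ij / n_i, with n_i = sum over k of n_ik."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in transitions:
        counts[i][j] += 1
    return [[c / sum(row) if sum(row) else 0.0 for c in row]
            for row in counts]
```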
    </sec>
    <sec id="sec-5">
      <title>Pattern prediction on multiple streams</title>
      <p>3.2.1 Problem formulation. Let O = {o1, ..., oK } be a set of K
objects (i.e., moving objects) and S = {s1, ..., sK } a set of real-time
streams of events, where si is generated by the object oi . Let P be
a user-defined pattern which we want to apply to every stream
si , i.e., each object will have its own DFA.</p>
      <p>The setting considered in this work is as follows: we
have K input event streams S and a system
consisting of K distributed predictor nodes n1, n2, ..., nK , each of
which consumes an input event stream si ∈ S. The goal is to
provide timely predictions and be able to do this at large-scale.
Each node ni handles a single event stream si associated with a
moving object oi ∈ O. In addition, it maintains a local prediction
model fi for the user-defined pattern P. The fi model provides
the online prediction about the future full match of the pattern
P in si for each new arriving event tuple.</p>
      <p>In short, we have multiple running instances of an online
prediction algorithm on distributed nodes for multiple input
event streams. More specifically, the input to our system consists
of massive streams of events that describe trajectories of moving
vessels in the context of maritime surveillance, where there is
one predictor node for each vessel’s event stream.</p>
      <p>
        3.2.2 The proposed approach. We designed and developed a
scalable and distributed pattern prediction system over massive
input event streams of moving objects. As the base prediction
model, we use the PMC forecasting method [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Moreover, we
propose to enable the information exchange between the distributed
predictors/learners of the input event streams, by adapting the
distributed online prediction protocol of [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] to synchronize the
prediction models, i.e., the transitions probabilities matrix of the
PMC predictors.
      </p>
      <p>
        Algorithm 1 presents the distributed online prediction protocol
by dynamic model synchronization on both the predictor nodes
and the coordinator. We refer to the PMC’s transition matrix Πi
on predictor node ni by fi . That is, when a predictor ni : i ∈ [k]
observes an event ej it revises its internal model state (i.e., fi ) and
provides a prediction report. Then it checks the local conditions
(batch size b and local model divergence from a reference model
fr ) to decide whether there is a need to synchronize its local
model with the coordinator or not. fr is maintained in the
predictor node as a copy of the last computed aggregated model
fˆ from the previous full synchronization step, which is shared
between all local predictors/learners. By monitoring the local
condition ∥ fi − fr ∥2 &gt; ∆ on all local predictors, we have a
guarantee that if none of the local conditions is violated, the
divergence (i.e., the variance of the local models, δ (f ) = (1/k) Σi∈[k] ∥ fi − fˆ ∥2)
does not exceed the threshold ∆ [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>On the other hand, the coordinator receives the prediction
models from the predictor nodes that requested model
synchronization (violation). Then it keeps incrementally
querying other nodes for their local prediction models until either
all nodes have been reached, or the variance of the aggregated
model fˆ computed from the already received models is less than or
equal to the divergence threshold ∆. Finally, the aggregated model fˆ is
sent back to the predictor nodes that sent their models after the
violation or have been queried by the coordinator.</p>
      <p>Algorithm 1: Communication-efficient Distributed Online
Learning Protocol</p>
      <p>Predictor node ni : on observing event ej
update the prediction model parameters fi and provide a
prediction service ;
if j mod b = 0 and ∥ fi − fr ∥2 &gt; ∆ then
    send fi to the Coordinator (violation) ;</p>
      <p>Coordinator:
receive the local models sent after violations, B = {fi }i=1..m ;
while |B| ≠ k and (1/|B|) Σfi ∈B ∥ fi − fˆ ∥2 &gt; ∆ do
    add nodes that have not reported a violation:
    B ← B ∪ {fl : fl ∉ B and l ∈ [k]} ;
    receive the models from the nodes in B ;
    compute a new global model fˆ ;
send fˆ to all the predictors in B and set fi ← fˆ for all fi ∈ B ;
if |B| = k then
    set a new reference model fr ← fˆ ;</p>
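The round structure of Algorithm 1 can be simulated in a few lines of Python. This sketch is illustrative, not the paper's implementation: models are flattened parameter vectors, plain averaging stands in for the count-based PMC merge of the synchronization operation, Kafka messaging is abstracted into function calls, and the names `sync_round`/`sq_dist` are hypothetical:

```python
def sq_dist(f, g):
    # squared L2 distance between two flattened parameter vectors
    return sum((a - b) ** 2 for a, b in zip(f, g))

def sync_round(models, f_ref, delta):
    """One coordinator round: collect the violating models B, keep
    pulling in further models until the variance around the
    aggregate drops to delta or all k models are in, then broadcast
    the aggregate to the nodes in B."""
    k, dim = len(models), len(f_ref)
    B = [i for i in range(k) if sq_dist(models[i], f_ref) > delta]
    if not B:
        return models, f_ref, 0  # no violation, no traffic
    rest = [i for i in range(k) if i not in B]
    while True:
        f_hat = [sum(models[i][d] for i in B) / len(B)
                 for d in range(dim)]
        variance = sum(sq_dist(models[i], f_hat) for i in B) / len(B)
        if variance <= delta or not rest:
            break
        B.append(rest.pop(0))  # resolution: query one more node
    for i in B:
        models[i] = list(f_hat)  # CoordinatorSync to the nodes in B
    f_ref = f_hat if len(B) == k else f_ref  # full sync: new reference
    return models, f_ref, len(B)
```

If no local condition is violated, no node contacts the coordinator at all, which is where the communication savings over static mini-batch synchronization come from.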
      <p>
        This protocol was introduced for linear models, and has been
extended to handle kernelized online learning models [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. We
also employ this protocol for the pattern prediction model, which
is internally based on the PMC_P^m. This allows the distributed
PMC_P^m predictors for multiple event streams to synchronize their
models (i.e., the transition probability matrix of each predictor)
within the system in a communication-efficient manner.
      </p>
      <p>
        We propose a synchronization operation for the parameters of
the models (fi = Πi : i ∈ [k]) of the k distributed PMC predictors.
The operation is based on distributing the maximum-likelihood
estimation [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] for the transition probabilities of the underlying
PMC_P^m models, described by:
      </p>
      <p>π̂i,j = ( Σk∈K nk,i,j ) / ( Σk∈K Σl∈L nk,i,l ),
where nk,i,j is the number of transitions from state i to state j observed at predictor nk.</p>
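To make the pooling concrete, here is a minimal Python sketch of the synchronization operation (illustrative, not the authors' implementation), assuming each predictor ships its raw transition-count matrix:

```python
def merged_mle(count_matrices):
    """Distributed maximum-likelihood estimate: pool the
    per-predictor transition counts n_{k,i,j} over all k predictors,
    then normalize each row of the pooled matrix."""
    n = len(count_matrices[0])
    pooled = [[sum(C[i][j] for C in count_matrices) for j in range(n)]
              for i in range(n)]
    return [[c / sum(row) if sum(row) else 0.0 for c in row]
            for row in pooled]
```

Merging counts rather than averaging probabilities weights each predictor by the amount of data it has actually observed.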
      <p>Moreover, we measure the divergence of a local model from
the reference model, ∥ fk − fr ∥2, by calculating the sum of squared
differences between the transition probabilities Πk and Πr :</p>
      <p>∥ fk − fr ∥2 = Σi,j ( π̂k,i,j − π̂r,i,j )2</p>
      <p>In general, our approach relies on enabling collaborative
learning among the distributed predictors. Each predictor node
receives a stream of events related to a distinct moving object,
and the central coordinator is responsible for synchronizing their
prediction models using the synchronization operation. Moreover,
the predictors only need to share the parameters of their
models, not the consumed event streams.</p>
      <p>
        We assume that the underlying event streams belong to the
same distribution and share the same behavior (e.g., mobility
patterns). We claim this assumption is reasonable in many
application domains: for instance, in the context of maritime
surveillance, vessels travel through standard routes, defined by the
International Maritime Organization (IMO). Additionally, vessels
have similar mobility patterns in specific areas such as moving
with low speed and multiple turns near the ports [
        <xref ref-type="bibr" rid="ref20 ref29">20, 29</xref>
        ]. That
allows our system to construct a coherent global prediction model
dynamically for all input event streams based on merging their
local prediction models.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Distributed architecture</title>
      <p>Our system consumes as input an aggregated stream of events
coming from a large number of moving objects, which is
continuously collected and fed into the system. It allows users to
register a pattern P to be monitored over each event stream of
a moving object. The output stream consists of original input
events and predictions of full matches of P, displayed to the end
users. Figure 3 presents the overview of our system architecture
and its main components.</p>
      <p>The system is composed of three processing units: (i)
pre-processing operators that receive the input event stream and
perform filtering and ordering operations, before partitioning the
input event stream into multiple event streams based on the
associated moving object; (ii) predictor nodes (learners), which are
responsible for maintaining a prediction model for the input
event streams, where each predictor node is configured to handle the
event stream of a single moving object, in order to provide
online predictions for a predefined pattern P; and (iii) a coordinator
node that communicates through Kafka stream channels with
the predictors to realize the distributed online learning protocol.
It builds a global prediction model, based on the received local
models, and then shares it among the predictors.</p>
      <p>
        Our distributed system consists of multiple pre-processing
operators, prediction nodes, and a central coordinator node. These
units run concurrently and are arranged as a data processing
pipeline, depicted in Figure 3. We leverage Apache Kafka as a
messaging platform to ingest the input event streams and to
publish the resulting streams. Also, it is used as the
communication channel between the predictor nodes and the coordinator.
Apache Flink is employed to execute the system’s distributed
processing units over the input event streams: the pre-processing
operators, the prediction units, and the coordinator node. Our
system architecture can be modeled as a logical network of
processing nodes, organized in the form of a DAG, inspired by the
Flink runtime dataflow programs [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
In practice, the aggregated input event stream is composed of multiple event
streams (partitions) from a set of moving objects.
      </p>
    </sec>
    <sec id="sec-7">
      <title>4 IMPLEMENTATION DETAILS</title>
      <p>In this section, we describe the implementation of
our system on top of the Apache Flink and Apache Kafka frameworks.
Each of the three sub-modules, described in Section 3.3, has
been implemented as Flink operations over the Kafka event
stream.</p>
      <sec id="sec-7-1">
        <title>Pre-processing and Prediction Operators</title>
        <p>Listing 1 shows how the main workflow of the system is
implemented as a Flink dataflow program.</p>
        <p>The system ingests the input event stream from a Kafka
cluster and maps it to a DataStream of events, which is then
processed by an EventTuplesMapper to create tuples of (id, event),
where the id is associated with the identifier of the moving object.
To handle events arriving out of order within a certain margin, the
stream of event tuples is processed by a TimestampAssigner, which
assigns timestamps to the input events based on the extracted
creation time. Afterwards, an ordered stream of event tuples is
generated using a process function, EventSorter.</p>
        <p>DataStream&lt;Event&gt; inputEventsStream =
env.addSource(kafkaConsumer);
// Create tuples (id, event) and assign timestamps
DataStream&lt;Tuple2&lt;String, Event&gt;&gt; eventTuplesStream =
inputEventsStream.map(new EventTuplesMapper())
.assignTimestampsAndWatermarks(new EventTimeAssigner());
// Create the ordered keyed stream
orderedEventsStream =
eventTuplesStream.keyBy(0).process(new EventSorter()).keyBy(0);
// Consume the events by the predictors
LocalPredictorNode&lt;Event&gt; predictorNode =
new LocalPredictorNode&lt;&gt;(P);
DataStream&lt;Event&gt; processedEventsStream =
orderedEventsStream.map(predictorNode);</p>
      </sec>
      <sec id="sec-7-2">
        <title>Listing 1: Flink pipeline for local predictors workflow</title>
        <p>The ordered stream is then transformed into a keyedEventsStream
by partitioning it, based on the id values, using a keyBy
operation. A local predictor node in a distributed environment is
represented by a map function over the keyedEventsStream. Each
parallel instance of the map operator (predictor) always
processes all events of the same moving object (i.e., the same id),
and maintains a bounded prediction model (i.e., a PMC_P^m predictor)
using Flink's Keyed State 3. The output streams of the moving
objects from the parallel instances of the predictor map functions
are sent to a new Kafka stream (i.e., with the same topic name). They
can then be processed by other components such as visualization or
user notification.</p>
        <p>Moreover, the implementation of the predictor map function
includes the communication with the coordinator using Kafka streams.
At the beginning of the execution, it sends a registration request
to the coordinator. At run-time, it also sends its local
prediction model as a synchronization request, or as a response to a
resolution request from the coordinator. These communication
messages are published into different Kafka topics as depicted in
Table 1.</p>
        <p>Coordinator. It manages the operations of the distributed online
learning protocol, and is also implemented as a Flink program.
The coordinator receives messages from the local predictors
through a Kafka stream of a topic named
"LocalToCoordinatorTopicId". It is implemented as a single map function over the
message stream, by setting the parallelism level of the Flink
program to "1". Increasing the parallelism scales up the number
of parallel coordinator instances, for example, in order to handle
different groupings of the input event streams. The map
operator of the coordinator handles three message types from the
predictors: (i) RegisterNode, which contains a registration request
from a new predictor node; (ii) RequestSync, to receive a local
model after a violation; and (iii) ResolutionAnswer, to receive a
resolution response from a local predictor node. In addition, it sends
CoordinatorSync messages to all predictors after creating a
new global prediction model, or RequestResolution to ask
the local predictors for their prediction models.
3Keyed State in Flink:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/state.html#keyed-state</p>
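        <p>The coordinator's dispatch over these three message types can be sketched as follows. This is an illustrative Python sketch, not the paper's implementation: the simple model averaging and the in-memory outbox standing in for the "CoordinatorToLocalTopicId"-style Kafka topic are our own assumptions.</p>

```python
# Sketch of the coordinator's single map operator: it dispatches on the
# message type, collects local models, and, once all registered predictors
# have answered, builds a global model and broadcasts a CoordinatorSync.
class Coordinator:
    def __init__(self):
        self.nodes = set()     # predictors that sent RegisterNode
        self.pending = {}      # local models received after violations
        self.outbox = []       # messages that would go to a Kafka topic

    def map(self, msg):
        kind = msg["type"]
        if kind == "RegisterNode":
            self.nodes.add(msg["node"])
        elif kind in ("RequestSync", "ResolutionAnswer"):
            self.pending[msg["node"]] = msg["model"]
            if len(self.pending) == len(self.nodes):
                # all local models in: create a global model (averaging is
                # a placeholder for the actual aggregation) and broadcast it
                global_model = sum(self.pending.values()) / len(self.pending)
                self.outbox.append({"type": "CoordinatorSync",
                                    "model": global_model})
                self.pending.clear()

coord = Coordinator()
coord.map({"type": "RegisterNode", "node": "p1"})
coord.map({"type": "RegisterNode", "node": "p2"})
coord.map({"type": "RequestSync", "node": "p1", "model": 2.0})
coord.map({"type": "ResolutionAnswer", "node": "p2", "model": 4.0})
```

        <p>Setting the parallelism of this operator to "1", as described above, ensures a single coordinator instance sees all predictor messages.</p>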
      </sec>
    </sec>
    <sec id="sec-8">
      <title>EMPIRICAL EVALUATION</title>
      <p>
        In this section, we evaluate our proposed system by analyzing
its predictive performance and communication complexity
using real-world event streams provided by the datAcron project
in the context of maritime monitoring. These event streams
describe critical points (i.e., synopses) of moving vessels'
trajectories, which are derived from raw AIS messages as described in
[
        <xref ref-type="bibr" rid="ref30">30</xref>
        ]. In particular, for our evaluation experiments we used a data
set of synopses that contains 4,684,444 critical points of 5,055
vessels sailing in the Atlantic Ocean during the period from 1
October 2015 to 31 March 2016.
      </p>
      <p>
        We used the synopses data set to generate a simulated stream
of event tuples, i.e., (id, timestamp, longitude, latitude, annotation,
speed, heading), which are processed by the system to attach an
extra attribute type that represents the event value, where type ∈ Σ.
Either Σ = Σ1 = {VerySlow, Slow, Moving, Sailing, Stopping}, based
on a discretization of the speed values; that is, Σ1 contains simple
event types derived from the speed value that can be used over
streams of raw AIS messages or critical points. Or Σ = Σ2 =
{stopStart, stopEnd, changeInSpeedStart, changeInSpeedEnd,
slowMotionStart, slowMotionEnd, gapStart, gapEnd, changeInHeading},
derived from the values of the annotation attribute
that encodes the extracted trajectory movement events [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ]. Σ2
represents the set of possible mobility changes in the vessel's
trajectory [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ]; each critical point has at least one event. In the
case of multiple values, we generate duplicate points, each
corresponding to one event, in the same order as in Σ2.
      </p>
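      <p>The speed-based derivation of a Σ1 event type can be sketched as below. The knot thresholds are purely illustrative assumptions; the paper does not specify the discretization boundaries.</p>

```python
# Sketch of attaching the Σ1 "type" attribute by discretizing the speed
# value of an event tuple. All threshold values below are assumptions
# made for illustration only.
def speed_to_type(speed_knots):
    if speed_knots >= 14.0:
        return "Sailing"
    if speed_knots >= 9.0:
        return "Moving"
    if speed_knots >= 4.0:
        return "Slow"
    if speed_knots >= 0.5:
        return "VerySlow"
    return "Stopping"

# Attach the extra attribute to a simulated event tuple
event = {"id": "vessel-1", "speed": 16.2}
event["type"] = speed_to_type(event["speed"])
```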
      <p>
        In our experiments, we monitor a pattern P1 = Sailing with
Σ1 that detects when the vessel is underway (sailing). Likewise,
we test a second pattern P2 = changeInHeading; gapStart; gapEnd;
changeInHeading with Σ2 that describes a potential illegal fishing
activity [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
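      <p>Since a pattern is a regular expression over event types, detecting a full match of P2 can be sketched by encoding each event type as a single character and applying an off-the-shelf regex engine. The character encoding below is our own illustrative device, not the paper's mechanism (the paper compiles patterns into Pattern Markov Chains).</p>

```python
# Sketch: map each Σ2 event type to one character so that Python's re
# module can detect a full match of the sequence pattern P2.
import re

SIGMA2 = ["stopStart", "stopEnd", "changeInSpeedStart", "changeInSpeedEnd",
          "slowMotionStart", "slowMotionEnd", "gapStart", "gapEnd",
          "changeInHeading"]
CODE = {t: chr(ord("a") + i) for i, t in enumerate(SIGMA2)}

def encode(events):
    return "".join(CODE[t] for t in events)

# P2 = changeInHeading; gapStart; gapEnd; changeInHeading
P2 = "".join(CODE[t] for t in
             ["changeInHeading", "gapStart", "gapEnd", "changeInHeading"])

stream = ["stopStart", "changeInHeading", "gapStart", "gapEnd",
          "changeInHeading", "stopEnd"]
found = re.search(P2, encode(stream)) is not None
```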
      <p>Experimental setup. We ran our experiments on a single-node
standalone Flink cluster deployed on an Ubuntu Server 17.04
machine with an Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz × 8
and 32GB RAM. We used Apache Flink v1.3.2 and Apache Kafka
v0.10.2.1 for our tests.</p>
      <p>Evaluation criteria. Our goal is to evaluate our distributed
pattern prediction system, which enables the synchronization of
the prediction models (i.e., PMC models) on the distributed
predictor nodes. Our proposed system can operate in three different
modes/schemes of model synchronization: (i) a static
scheme that synchronizes the prediction models
periodically, every b input events in each stream; (ii) continuous, full
synchronization on each incoming event (hypothetical); (iii) a
dynamic synchronization protocol in which the predictors
communicate their local prediction models periodically, but only
on the condition that the divergence of the local models from a
reference model exceeds a variance threshold ∆ (recommended).</p>
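      <p>The triggering condition of the dynamic scheme (iii) can be sketched as follows. The squared-distance divergence measure used here is an illustrative assumption; the concrete divergence function is defined by the underlying synchronization protocol.</p>

```python
# Sketch of the dynamic synchronization condition: a predictor only
# transmits its local model when its divergence from the last reference
# (global) model exceeds the variance threshold delta (∆).
def should_sync(local_model, reference_model, delta):
    # Assumed divergence: squared Euclidean distance between parameters.
    divergence = sum((l - r) ** 2
                     for l, r in zip(local_model, reference_model))
    return divergence > delta

reference = [0.5, 0.5]
drifted = should_sync([0.9, 0.1], reference, delta=0.1)   # model drifted
close = should_sync([0.55, 0.45], reference, delta=0.1)   # model still close
```

      <p>Only the drifted model would trigger a RequestSync message, which is how the dynamic scheme saves communication relative to the static one.</p>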
      <p>We compare our proposed system against the isolated
prediction mode, in which models are computed on single streams only,
and compare the predictive performance in terms of:
(i) Precision = (# of correct predictions) / (# of predictions),
the fraction of the produced predictions that are correct. For each
new event in the stream, the predictor provides a prediction
interval within which the full match of the pattern might occur.
Thus, the predictions are temporarily stored until a full match is
detected. At that point, all stored prediction intervals are
evaluated, counting as correct those intervals within which the full
match occurred.
(ii) Spread = end(I) − start(I), the width of the prediction
interval I, i.e., the number of events between
the start and the end of I.</p>
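      <p>The two measures can be sketched as follows, with a prediction interval represented as a (start, end) pair of event indices (an illustrative Python sketch of the definitions above).</p>

```python
# Sketch of the evaluation measures: a stored interval I = (start, end)
# counts as correct if the full match occurred inside it; the spread is
# end(I) - start(I), i.e., the interval width in number of events.
def precision(intervals, match_index):
    correct = sum(1 for start, end in intervals
                  if match_index >= start and end >= match_index)
    return correct / len(intervals)

def spread(interval):
    start, end = interval
    return end - start

stored = [(10, 14), (11, 13), (16, 20)]   # intervals predicted so far
p = precision(stored, match_index=12)     # full match detected at event 12
```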
      <p>Moreover, we study the communication cost by measuring the
cumulative communication, i.e., the number of messages
required by the distributed online learning
modes to synchronize the prediction models. Next, we present
the experimental results for the patterns P1 = Sailing with an
order of m = 2, and P2 = changeInHeading; gapStart; gapEnd;
changeInHeading with first order m = 1. All experiments are
performed with the batch size set to 100 (b = 100), a
variance threshold of 2 (∆ = 2), a PMC prediction threshold of 80%
(θf c = 80%), and a maximum spread of 200.</p>
      <p>Experimental results. Figure 4 depicts the average precision
scores of the prediction models (one prediction model per vessel)
for all synchronization modes for the first pattern P1 = Sailing,
namely, isolated without synchronization, continuous (full-sync),
static, and our recommended approach based on the dynamic
synchronization scheme. It can be clearly seen that all
distributed learning methods outperform the isolated prediction models.
The hypothetical method of full continuous synchronization has
the highest precision rates, while the static and dynamic
synchronization schemes have close precision scores. Consequently,
dynamic synchronization is not much weaker than static
synchronization, but requires much less communication, as
explained below.</p>
      <p>Figure 5 shows the amount of accumulated
communication required by the three modes of the distributed online
learning, while the isolated approach does not require any
communication between the predictors. These results are shown for
P1. As expected, a larger amount of communication is required
for continuous synchronization compared to the static and
dynamic approaches. It can also be seen that the
communication overhead can be reduced by applying the dynamic
synchronization protocol (a reduction by a factor of 100) compared to the
static synchronization scheme, even with a small variance
threshold ∆ = 2. Furthermore, the dynamic protocol still preserves
a predictive performance close to the static one (see Figure 4).
Therefore, we only consider the dynamic synchronization
and the isolated approach in the evaluation of the second pattern.</p>
      <p>In Figure 4, we also noted that the precision decreases in a
first phase and then stabilizes. This seems counter-intuitive,
as the models should improve when getting more data, up to a
certain point. To explain this, we investigated the effect
of the distributed synchronization of the prediction models on
the average spread value; Figure 6 shows the spread results for
all approaches. It can be seen that the spread is higher for the
distributed learning based methods compared to the isolated
approach. Furthermore, the average spread decreases over
time until convergence, as a result of increasing confidence in the
models. This may explain the drop in the precision scores from
the beginning until convergence is reached. We will investigate
the interrelation between precision and spread further in future
work.</p>
      <p>For the second, more complex pattern (P2), we found that
the precision was worse for a distributed model generated over all
vessels than for the model created for each vessel in isolation. This
indicates that there is no global model describing the behavior of
all vessels consistently. However, when looking at specific groups
of vessels, we achieved an improvement in terms of precision.
As an initial experiment, we only enabled the synchronization of the
prediction models associated with vessels that belong to the same
vessel class. Currently, this change is technically performed by
an extra filter step that passes only one type of vessels, so that
multiple runs of the system are required to cover all vessel types. For
example, Figure 7 shows the precision scores for vessels of class
PLEASURE CRAFT. An interesting observation is that the dynamic
synchronization approach still has higher precision scores than
the isolated approach. This case might seem to contradict
our assumption that the input event streams belong to the same
distribution and share the same behavior, but it actually satisfies
the same assumption among the predictors of vessels within
the same type group. We will further investigate the effect of
groupings and more patterns in future work.</p>
    </sec>
    <sec id="sec-9">
      <title>CONCLUSION</title>
      <p>
        In this paper, we have presented a system that provides
distributed pattern prediction over multiple large-scale event streams
of moving objects (vessels). The system uses event
forecasting with Pattern Markov Chains (PMC) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] as the base prediction
model on each event stream, and it applies the protocol for
distributed online prediction [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] to exchange information between
the prediction models over multiple input event streams. Our
proposed system has been implemented using Apache Flink and
Apache Kafka. To show the usefulness and effectiveness
of our approach, we empirically tested it against large real-world
event streams related to trajectories of moving vessels.
      </p>
      <p>As future work, we will address the open issues emerging
from the current findings. Firstly, we will study the interrelation
between precision and spread scores by validating the approach
over synthetic event streams. Secondly, we will investigate the
effect of grouping the input event streams on the predictive
performance of our method.</p>
    </sec>
    <sec id="sec-10">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work was supported by EU Horizon 2020 datAcron project
(grant agreement No 687591).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Rakesh</given-names>
            <surname>Agrawal</surname>
          </string-name>
          , Tomasz Imieliński, and
          <string-name>
            <given-names>Arun</given-names>
            <surname>Swami</surname>
          </string-name>
          .
          <year>1993</year>
          .
          <article-title>Mining Association Rules Between Sets of Items in Large Databases</article-title>
          .
          <source>In ACM SIGMOD.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Elias</given-names>
            <surname>Alevizos</surname>
          </string-name>
          , Alexander Artikis, and
          <string-name>
            <given-names>George</given-names>
            <surname>Paliouras</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Event Forecasting with Pattern Markov Chains</article-title>
          .
          <source>In Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems. ACM</source>
          ,
          <fpage>146</fpage>
          -
          <lpage>157</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Elias</given-names>
            <surname>Alevizos</surname>
          </string-name>
          , Anastasios Skarlatidis, Alexander Artikis, and
          <string-name>
            <given-names>Georgios</given-names>
            <surname>Paliouras</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Complex event recognition under uncertainty: A short survey</article-title>
          .
          <source>Event Processing, Forecasting and Decision-Making in the Big Data Era (EPForDM)</source>
          (
          <year>2015</year>
          ),
          <fpage>97</fpage>
          -
          <lpage>103</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Theodore W.</given-names>
            <surname>Anderson</surname>
          </string-name>
          and
          <string-name>
            <given-names>Leo A.</given-names>
            <surname>Goodman</surname>
          </string-name>
          .
          <year>1957</year>
          .
          <article-title>Statistical inference about Markov chains</article-title>
          .
          <source>The Annals of Mathematical Statistics</source>
          (
          <year>1957</year>
          ),
          <fpage>89</fpage>
          -
          <lpage>110</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Amazon Web Services (AWS)</surname>
          </string-name>
          .
          <year>2013</year>
          . Amazon Kinesis. https://aws.amazon.com/de/kinesis/. (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] Paris Carbone, Asterios Katsifodimos, Stephan Ewen, Volker Markl, Seif Haridi, and
          <string-name>
            <given-names>Kostas</given-names>
            <surname>Tzoumas</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Apache flink: Stream and batch processing in a single engine</article-title>
          .
          <source>Bulletin of the IEEE Computer Society Technical Committee on Data Engineering</source>
          <volume>36</volume>
          ,
          <issue>4</issue>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Gianpaolo</given-names>
            <surname>Cugola</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Margara</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Processing Flows of Information: From Data Stream to Complex Event Processing</article-title>
          .
          <source>ACM Comput. Surv</source>
          .
          <volume>44</volume>
          ,
          <issue>3</issue>
          ,
          Article 15
          (
          <year>June 2012</year>
          ),
          62 pages. https://doi.org/10.1145/2187671.2187677
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Ofer</given-names>
            <surname>Dekel</surname>
          </string-name>
          , Ran Gilad-Bachrach,
          <string-name>
            <given-names>Ohad</given-names>
            <surname>Shamir</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Lin</given-names>
            <surname>Xiao</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Optimal distributed online prediction using mini-batches</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>13</volume>
          ,
          Jan
          (
          <year>2012</year>
          ),
          <fpage>165</fpage>
          -
          <lpage>202</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Lina</given-names>
            <surname>Fahed</surname>
          </string-name>
          , Armelle Brun, and
          <string-name>
            <given-names>Anne</given-names>
            <surname>Boyer</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Efficient Discovery of Episode Rules with a Minimal Antecedent and a Distant Consequent</article-title>
          .
          <source>In Knowledge Discovery, Knowledge Engineering and Knowledge Management</source>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Ioannis</given-names>
            <surname>Flouris</surname>
          </string-name>
          , Vasiliki Manikaki, Nikos Giatrakos, Antonios Deligiannakis, Minos Garofalakis, Michael Mock,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Bothe</surname>
          </string-name>
          , Inna Skarbovsky, Fabiana Fournier,
          <string-name>
            <given-names>Marko</given-names>
            <surname>Stajcer</surname>
          </string-name>
          , et al.
          <year>2016</year>
          .
          <article-title>FERARI: A Prototype for Complex Event Processing over Streaming Multi-cloud Platforms</article-title>
          .
          <source>In Proceedings of the 2016 International Conference on Management of Data. ACM</source>
          ,
          <fpage>2093</fpage>
          -
          <lpage>2096</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <source>The Apache Software Foundation</source>
          .
          <year>2012</year>
          . Apache Kafka. https://kafka.apache.org/. (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <source>The Apache Software Foundation</source>
          .
          <year>2013</year>
          .
          <article-title>Apache Spark Streaming</article-title>
          . http://spark.apache.org/streaming/. (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <source>The Apache Software Foundation</source>
          .
          <year>2014</year>
          . Apache Flink. https://flink.apache.org/. (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <source>The Apache Software Foundation</source>
          .
          <year>2014</year>
          . Apache Storm. http://storm.apache.org/. (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>John E.</given-names>
            <surname>Hopcroft</surname>
          </string-name>
          , Rajeev Motwani, and
          <string-name>
            <given-names>Jeffrey D.</given-names>
            <surname>Ullman</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Automata theory, languages, and computation</article-title>
          .
          <source>International Edition</source>
          <volume>24</volume>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Kamp</surname>
          </string-name>
          , Mario Boley, Daniel Keren, Assaf Schuster, and
          <string-name>
            <given-names>Izchak</given-names>
            <surname>Sharfman</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Communication-efficient distributed online prediction by dynamic model synchronization</article-title>
          .
          <source>In Joint European Conference on Machine Learning and Knowledge Discovery in Databases</source>
          . Springer,
          <fpage>623</fpage>
          -
          <lpage>639</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Kamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Bothe</surname>
          </string-name>
          , Mario Boley, and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Mock</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Communication-Efficient Distributed Online Learning with Kernels</article-title>
          .
          <source>In Joint European Conference on Machine Learning and Knowledge Discovery in Databases</source>
          . Springer,
          <fpage>805</fpage>
          -
          <lpage>819</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>John</given-names>
            <surname>Langford</surname>
          </string-name>
          , Alex J Smola, and
          <string-name>
            <given-names>Martin</given-names>
            <surname>Zinkevich</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Slow learners are fast</article-title>
          .
          <source>Advances in Neural Information Processing Systems</source>
          <volume>22</volume>
          (
          <year>2009</year>
          ),
          <fpage>2331</fpage>
          -
          <lpage>2339</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Srivatsan</given-names>
            <surname>Laxman</surname>
          </string-name>
          , Vikram Tankasali, and
          <string-name>
            <given-names>Ryen W.</given-names>
            <surname>White</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Stream Prediction Using a Generative Model Based on Frequent Episodes in Event Sequences</article-title>
          .
          <source>In ACM SIGKDD.</source>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Bo</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Erico N.</given-names>
            <surname>de Souza</surname>
          </string-name>
          , Stan Matwin, and
          <string-name>
            <given-names>Marcin</given-names>
            <surname>Sydow</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Knowledgebased clustering of ship trajectories using density-based approach</article-title>
          .
          <source>In Big Data (Big Data)</source>
          ,
          <source>2014 IEEE International Conference on. IEEE</source>
          ,
          <fpage>603</fpage>
          -
          <lpage>608</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>David</given-names>
            <surname>Luckham</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>The power of events: An introduction to complex event processing in distributed enterprise systems</article-title>
          .
          <source>In International Workshop on Rules and Rule Markup Languages for the Semantic Web</source>
          . Springer, 3-
          <fpage>3</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Heikki</given-names>
            <surname>Mannila</surname>
          </string-name>
          ,
          Hannu Toivonen, and
          <string-name>
            <given-names>A. Inkeri</given-names>
            <surname>Verkamo</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>Discovery of Frequent Episodes in Event Sequences</article-title>
          .
          <source>Data Mining and Knowledge Discovery</source>
          (
          <year>1997</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Mathioudakis</surname>
          </string-name>
          and
          <string-name>
            <given-names>Nick</given-names>
            <surname>Koudas</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Twittermonitor: trend detection over the twitter stream</article-title>
          .
          <source>In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data. ACM</source>
          ,
          <fpage>1155</fpage>
          -
          <lpage>1158</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Daniele</given-names>
            <surname>Miorandi</surname>
          </string-name>
          , Sabrina Sicari, Francesco De Pellegrini, and
          <string-name>
            <given-names>Imrich</given-names>
            <surname>Chlamtac</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Internet of things: Vision, applications and research challenges</article-title>
          .
          <source>Ad Hoc Networks</source>
          <volume>10</volume>
          ,
          <issue>7</issue>
          (
          <year>2012</year>
          ),
          <fpage>1497</fpage>
          -
          <lpage>1516</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Douglas C.</given-names>
            <surname>Montgomery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Cheryl L.</given-names>
            <surname>Jennings</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Murat</given-names>
            <surname>Kulahci</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Introduction to time series analysis and forecasting</article-title>
          . Wiley.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Vinod</given-names>
            <surname>Muthusamy</surname>
          </string-name>
          , Haifeng Liu, and
          <string-name>
            <surname>Hans-Arno Jacobsen</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Predictive Publish/Subscribe Matching</article-title>
          .
          <source>In DEBS. ACM.</source>
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Grégory</given-names>
            <surname>Nuel</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Pattern Markov Chains: Optimal Markov Chain Embedding through Deterministic Finite Automata</article-title>
          .
          <source>Journal of Applied Probability</source>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>International Maritime Organization</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Automatic identification systems</article-title>
          . http://www.imo.org/OurWork/Safety/Navigation/Pages/AIS.aspx. (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Giuliana</given-names>
            <surname>Pallotta</surname>
          </string-name>
          , Michele Vespe, and
          <string-name>
            <given-names>Karna</given-names>
            <surname>Bryan</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Vessel pattern knowledge discovery from AIS data: A framework for anomaly detection and route prediction</article-title>
          .
          <source>Entropy</source>
          <volume>15</volume>
          ,
          <issue>6</issue>
          (
          <year>2013</year>
          ),
          <fpage>2218</fpage>
          -
          <lpage>2245</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Kostas</given-names>
            <surname>Patroumpas</surname>
          </string-name>
          , Elias Alevizos, Alexander Artikis, Marios Vodas, Nikos Pelekis, and
          <string-name>
            <given-names>Yannis</given-names>
            <surname>Theodoridis</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Online event recognition from moving vessel trajectories</article-title>
          .
          <source>GeoInformatica</source>
          <volume>21</volume>
          ,
          <issue>2</issue>
          (
          <year>2017</year>
          ),
          <fpage>389</fpage>
          -
          <lpage>427</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Kostas</given-names>
            <surname>Patroumpas</surname>
          </string-name>
          , Alexander Artikis, Nikos Katzouris, Marios Vodas, Yannis Theodoridis, and
          <string-name>
            <given-names>Nikos</given-names>
            <surname>Pelekis</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Event Recognition for Maritime Surveillance</article-title>
          .
          <source>In EDBT</source>
          ,
          <fpage>629</fpage>
          -
          <lpage>640</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>R.</given-names>
            <surname>Vilalta</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sheng</given-names>
            <surname>Ma</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Predicting rare events in temporal domains</article-title>
          .
          <source>In ICDM</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Lin</given-names>
            <surname>Xiao</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Dual averaging methods for regularized stochastic learning and online optimization</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>11</volume>
          ,
          <issue>Oct</issue>
          (
          <year>2010</year>
          ),
          <fpage>2543</fpage>
          -
          <lpage>2596</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Feng</given-names>
            <surname>Yan</surname>
          </string-name>
          , Shreyas Sundaram, SVN Vishwanathan, and
          <string-name>
            <given-names>Yuan</given-names>
            <surname>Qi</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Distributed autonomous online learning: Regrets and intrinsic privacy-preserving properties</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>25</volume>
          ,
          <issue>11</issue>
          (
          <year>2013</year>
          ),
          <fpage>2483</fpage>
          -
          <lpage>2493</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Cheng</given-names>
            <surname>Zhou</surname>
          </string-name>
          , Boris Cule, and
          <string-name>
            <given-names>Bart</given-names>
            <surname>Goethals</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>A pattern based predictor for event streams</article-title>
          .
          <source>Expert Systems with Applications</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>