<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>March</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Automated Pattern Extraction in Complex Event Processing Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Styliani Kyrama</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anastasios Gounaris</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Informatics, Aristotle University of Thessaloniki</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <volume>24</volume>
      <issue>2026</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Complex Event Processing (CEP) is employed in various domains to detect the occurrence of sequential patterns of interest, thus enabling timely reactions and informed decision-making. Although CEP systems can eficiently detect predefined patterns, they exhibit an important limitation; they do not examine detected patterns for systematic continuations or alternative endings that could enrich the information provided to users. Several works attempt to discover new patterns, drawing from the fields of CEP, Sequential Pattern Mining, Machine Learning, and Deep Learning; however, these approaches typically require labeled data, perform ofline analysis, or both. To address this limitation, we propose a mechanism for CEP systems that employs a pattern-trie structure to maintain an overview of pattern evolution, together with three update algorithms. We provide an efective and eficient solution for exploring frequent extensions and variations of the original pattern online, during CEP execution. Our evaluation indicates that online pattern exploration can be integrated with CEP execution without significant additional overhead relative to the underlying CEP execution.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;CEP</kwd>
        <kwd>automated pattern detection</kwd>
        <kwd>pattern-trie</kwd>
        <kwd>frequent pattern evolution</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Modern systems increasingly process large volumes of data to extract valuable knowledge and insights,
often originating from sensors or other Internet of Things (IoT) devices, rather than traditional databases
[
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Such processing may include the detection of sequential pattern occurrences, the generation
of alerts, and timely updates to users, enabling rapid reaction and informed decision-making [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. To
support these requirements, many systems across diverse domains employ Complex Event Processing
(CEP) solutions [
        <xref ref-type="bibr" rid="ref4 ref5 ref6 ref7">4, 5, 6, 7</xref>
        ]. Despite the maturity of CEP, most of the existing approaches rely on
pattern queries specified by domain experts. For example, in the field of healthcare, a doctor can define
situations of interest that they want to monitor for a specific patient. However, defining a descriptive
and accurate set of rules or patterns is challenging, even for a domain expert. Subtle relationships or
unknown correlations within the data may be overlooked. This dependency limits their applicability
in environments where domain experts cannot easily specify all relevant patterns or where event
correlations evolve dynamically.
      </p>
      <p>
        Many optimizations have been proposed in CEP systems, either to improve performance even in
demanding scenarios [
        <xref ref-type="bibr" rid="ref10 ref11 ref8 ref9">8, 9, 10, 11</xref>
        ] or to enhance accuracy under real-world inconsistencies [
        <xref ref-type="bibr" rid="ref12 ref13 ref14">12, 13, 14</xref>
        ],
such as out-of-order or late arrivals. Yet, all these solutions assume that the pattern logic is predefined,
which in practice, is very restrictive. Several research eforts have attempted to move beyond static,
pre-defined CEP queries by learning or discovering patterns from historical or operational data.
      </p>
      <p>
        In this work, we propose an online automatic pattern extraction approach for CEP systems,
implemented as an add-on on top of the LimeCEP engine[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. While traditional CEP systems detect only the
original pattern explicitly defined by the user, our approach explores and maintains frequent pattern
evolutions during CEP execution, with minimal overhead. We introduce a lightweight pattern-trie
architecture together with three complementary counting algorithms that enable practical online
pattern suggestion within the CEP engine, while preserving throughput characteristics under demanding
workloads. Specifically, we consider two types of evolutions: (i) extended patterns, which append
additional event types to the original pattern to capture richer event correlations, and (ii) altered patterns,
which replace the final event type of the pattern and act as complementary suggestions that may reveal
relevant event structures overlooked by the user. Unlike prior work on sequential or episode mining, our
approach performs online exploration of pattern evolutions directly inside the CEP engine, preserving
the exact execution semantics of the original pattern query, including window and selection policies. For
example, given a user-defined pattern ABC, an extended evolution is ABCD, while an altered evolution
is ABE. Our evaluation shows that automated pattern extraction can run alongside CEP execution on
large windows and real-world streams, without introducing additional throughput overhead.
      </p>
      <p>Outline. Next, we provide background information and we formalize our problem. The architecture
and our novel algorithms are presented in Section 3. We evaluate our proposal in Section 4, and we
discuss related work in Section 5, before we conclude in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries</title>
      <sec id="sec-2-1">
        <title>2.1. CEP Basics</title>
        <p>
          CEP systems continuously evaluate pattern queries over incoming event streams. A pattern is typically
an ordered sequence of event types ⟨1, . . . , ⟩, evaluated under a selection policy (e.g.,
skip-tillnext-match (STNM) or skip-till-any-match (STAM)), a consumption policy (e.g. consume, reuse), and
time constraints  [
          <xref ref-type="bibr" rid="ref1 ref3">3, 16, 1</xref>
          ]. During runtime, the CEP engine detects matches of the pattern from
the events in the incoming stream. Formally, given a stream  = ⟨1, 2, . . . ⟩, where  is an event at
timestamp , a match of pattern  = (⟨1, . . . , ⟩, , ) is a subsequence ⟨1 , . . . ,  ⟩ such that
type( ) = , 1 . &lt; · · · &lt;   ., and  . −  1 . ≤  (. denotes the timestamp of ). CEP
engines implement these semantics through diferent execution models (e.g., NFA-, tree-, or graph–based
architectures [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]). Our method is implemented on top of LimeCEP [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], but the problem formulation
relies on generic CEP semantics. Without loss of generality, in this work we assume that the consume
policy is always set to reuse [
          <xref ref-type="bibr" rid="ref1 ref3">3, 1</xref>
          ].
        </p>
        <p>Limitations of current CEP engines. Existing CEP systems operate strictly with user-defined
pattern queries. This leads to two core limitations: (i) lack of runtime adaptability, i.e., the CEP system
cannot integrate new correlations or shifts of incoming data in the active pattern; and (ii) lack of
introspective analysis, i.e., the engines do not examine the produced matches for frequent continuations,
or alternative end events that may systematically appear in the incoming event stream.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Motivation Example</title>
        <p>Consider a smart-building IoT system monitoring HVAC units, vibration sensors, and power
consumption. A domain expert submits to the CEP system the pattern</p>
        <p>= (⟨TempRise, FanStart, PowerIncrease⟩,  = STNM,  = 10)
in order to detect potential overheating episodes.</p>
        <p>However, during continuous monitoring and evaluation of the stream, it is observed that after a
sequence of ⟨TempRise, FanStart, PowerIncrease⟩ events, a VibrationSpike event often occurs,
indicating that the fan is starting to operate under mechanical stress. Additionally, it is observed that
after the arrival of the ⟨TempRise, FanStart⟩ combination, a PressureAlert event regularly occurs
besides PowerIncrease, suggesting an alternative expression of the same physical condition. A
CEP engine will continue to match only the expert-defined pattern , without revealing the frequent
extension</p>
        <p>+ = (⟨TempRise, FanStart, PowerIncrease, VibrationSpike⟩,  = STNM,  = 10)
or the frequent variation</p>
        <p>′ = (⟨TempRise, FanStart, PressureAlert⟩,  = STNM,  = 10).</p>
        <p>Without a mechanism that identifies such frequently recurring extensions or variations of the original
pattern, the CEP system provides only a partial view of the actual behaviour observed in the stream.
This highlights the need for CEP engines to enhance user-defined patterns with meaningful suggestions
observed directly in the incoming stream.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Problem Definition</title>
        <p>
          In this section, we formalize the problem of systematically exploring frequent extensions and variations
of a user-defined CEP pattern query from an incoming event stream. The notation used in this Section
follows that of [
          <xref ref-type="bibr" rid="ref1 ref3">3, 16, 1</xref>
          ].
        </p>
        <p>Given a pattern  defined as a sequence of events ⟨1, 2, . . . , ⟩, with selection policy as ,
window , and a continuous incoming event stream , the goal is to explore two kinds of pattern
evolutions, while maintaining same , and :
• Extensions +: patterns of the form + = (⟨1, 2, . . . , , +1⟩, , ), obtained by
appending a new event type +1 to the end of  event sequence; and
• Variations ′ : patterns of the form ′ = (⟨1, 2, . . . , −1 , ′⟩, ), obtained by replacing
the last event type of  event sequence with a new event type ′ that does not already appear
in .</p>
        <p>Each possible pattern evolution, either + or ′ , should satisfy two conditions: (i) consistency with
the event-time semantics of the original pattern , i.e. satisfy the window and temporal constraints; and
(ii) its confidence must exceed user-defined threshold. Regarding the second condition, an extension of
the pattern + = ⟨1, . . . , , +1⟩, should satisfy
 (⟨1, . . . , +1⟩) = freq(⟨∑︁1, . . .f,req(+)1⟩) ≥  conf
∈{p,p+,p′ }
where freq(⟨ET1, . . . , ETn+1⟩) denotes the frequency of event type sequence ⟨1, . . . , +1⟩, i.e., the
number of valid occurrences in the stream , and ∑︀∈{,+,′ } freq() indicates the number of matches
including all possible extensions and variations. Corresponding confidence constraints apply also to
variations ′ . The exploration of such possible pattern evolutions must be performed incrementally, in
a streaming manner.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Automated Pattern Extraction in LimeCEP</title>
      <p>
        The pattern exploration as defined in Section 2.3 is implemented on top of LimeCEP engine [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], as an
add-on component, without modifying the core CEP semantics. The source code of our implementation
is publicly available1. Its integration within the overall system architecture is shown in Figure 1. In a
nutshell, LimeCEP is a lightweight CEP engine, designed for real-time pattern detection over event
streams with out-of-order arrivals, and high input rates. Instead of computing and maintaining partial
matches as NFA-based CEP solutions, LimeCEP indexes incoming events in a timestamp-ordered
inmemory structure, and evaluates patterns retrospectively. This approach does not sufer from explosion
of the space required when partial matches increase, and enables eficient pattern detection under
disorder, while partial reprocessing of events, combined with speculation, allows LimeCEP to correct
previously emitted matches when late events arrive.
      </p>
      <sec id="sec-3-1">
        <title>1https://github.com/KyraStyl/LimeCEP/tree/semi-automated</title>
        <p>SOURCES
LimeCEP Architecture</p>
        <p>EVENT RESULT
CONSUMERS MANAGER CEP ENGINE MANAGER
SHARED
TREESETS</p>
        <p>PROFILING
STATISTICAL MANAGER</p>
        <sec id="sec-3-1-1">
          <title>3.1. Architecture</title>
          <p>To support the continuous exploration and generation of new pattern candidates, either extensions
or variations of the user-defined pattern, we introduce two modules: (i) a Pattern Trie data structure,
which compactly maintains all possible explored patterns (along with their frequency and statistics), as
paths in a tree-like structure, similar to prefix trees, and (ii) a Pattern Trie Manager, which is responsible
for monitoring the incoming stream, and updating the pattern trie structure. Each root-to-leaf path in
the trie corresponds to a candidate pattern, and the associated counts are computed under the same
window and selection policy as the original query.</p>
          <p>Pattern Trie Manager. The Pattern Trie Manager (PTM) is responsible for exploring candidate
pattern extensions and variations, as formalized in Section 2.3. Conceptually, the PTM continuously
monitors the stream  for events of unknown types  , i.e., event types that do not appear in the
user-defined pattern . Whenever such an event is received, the PTM evaluates the corresponding
candidate patterns. More specifically, given a monitored pattern  = (⟨1, . . . , ⟩, , ) and a new
event type  , the PTM considers two candidate evolutions:
• the extended pattern + = (⟨⟨1, . . . , ⟩ ‖  ⟩, , );
• the varied pattern ′ = (⟨⟨1, . . . , −1 ⟩ ‖  ⟩, , ), where ⟨1, . . . , ⟩, i.e. the event-type
sequence of pattern  without its last symbol, is the prefix().</p>
          <p>
            For each such candidate pattern, the PTM estimates its frequency under the constraints imposed by
 and selection policy  of the original user-defined pattern . In this work, as already stated, we
do not consider event consumption policies, and therefore, we adopt the reuse policy (as defined in
[
            <xref ref-type="bibr" rid="ref1 ref3">3, 1</xref>
            ]), where events may participate in multiple matches without restrictions. Once the frequency
of a candidate evolution has been estimated, the PTM records this information in the pattern trie, by
updating its corresponding path. The update of pattern statistics is carried out using one of three
counting algorithms: (i) Brute Force, (ii) Reverse Count, and (iii) Incremental Count, which difer in
eficiency but yield identical results. These algorithms are introduced in Section 3.2. PTM uses the
trie to maintain a consistent view of the candidate space and to determine whether an extension or
variation satisfies the user-configured confidence threshold. Whenever a candidate pattern meets this
requirement, PTM suggests it to the user as a possible evolution to the original query.
          </p>
          <p>Pattern Trie Representation. The pattern trie is a prefix-tree structure that compactly stores all
patterns explored by the PTM. Each root-to-node path encodes a prefix ⟨1, . . . , ⟩ of a candidate
pattern, while each root-to-leaf path, whose final node is marked as ending, corresponds to a complete
candidate pattern. Using a shared-prefix hierarchy, it is ensured that common subsequences among the
candidate patterns are stored only once in the pattern trie structure. This enables eficient accumulation
of statistics across related patterns. Figure 2a illustrates the general structure of the pattern trie.</p>
          <p>Each node stores one event type, and its prefix is given implicitly by the path from the root to this
node. In order to support pattern maintenance and exploration, each node also stores:
root
ET1</p>
          <p>count(ET1)
ET2</p>
          <p>count(ET1ET2)
ETm-2
count(ET1...ETm-2)</p>
          <p>ETm'
ETm-1 count(ET1...ETm')</p>
          <p>count(ET1...ETm-1) ETm+ count(ET1...ETm+)
(a) Pattern Trie representation
• a count representing the number of valid matches in which this root-to-node prefix (sub-pattern)
participates, based on window  and selection policy  constraints, denoted as (1 . . . )
for each node  with event type ;
• an isEndOfPattern flag, indicating whether this node marks the end of a complete pattern
(represented as rounded boxes in Figure 2a);
• a set of children nodes, mapping event types to successor nodes, representing possible extensions
for this root-to-node prefix; and
• a reference to this node’s parent, for eficient navigation along the prefix, and computation of the
 , as defined in Section 2.3, for this root-to-node prefix or complete pattern.</p>
          <p>According to the formal definitions, the frequency of an event-type sequence is the number of valid
occurrences in the stream, whereas the count is the number of valid matches in which the sequence
participates. Although these two are closely related, they are not identical. The pattern trie stores only
the count for each prefix; however, the corresponding frequency of a sequence  can be derived using
the following relation:
freq() = ⎨⎧⎪count() −</p>
          <p>∑︁
∈children()
⎪⎩count(),
count(), if .isEndOfPattern == true,
otherwise.</p>
          <p>The pattern trie is updated whenever the PTM identifies an occurrence of a candidate pattern 
in the incoming event stream . The counts stored along the corresponding path  are incremented,
ensuring that prefix statistics are maintained across all explored patterns.</p>
          <p>Figure 2b illustrates how the pattern trie evolves as events are processed and matches are detected in
the stream. Consider the event stream  = ⟨1, 2, 3, 4, 5, 6, 7, 8, 9, 10⟩, and the pattern 
defined by the user is the following  = (⟨, , ⟩,  =   ,  = 10). The tree is initialized with
the root-to-leaf path , as shown in the left part of Figure 2b. When 6 arrives, the PTM detects
three valid matches, namely 1 = ⟨1, 3, 6⟩, 2 = ⟨2, 3, 6⟩, and 3 = ⟨4, 5, 6⟩ under the 
and  constraints imposed by , and updates the count of the  path of the pattern trie to 3.</p>
          <p>The arrival of 7 (middle figure) introduces an event type  not contained in , and therefore
triggers the exploration of the candidate patterns + = (⟨, , , ⟩,  =   ,  = 10) and
′ = (⟨, , ⟩,  =   ,  = 10). Correspondingly, a new ending child node  extends the 
prefix path, and another ending child node  is added to the  prefix path. During this exploration,
the PTM detects three matches for + path, 4+ = ⟨1, 3, 6, 7⟩, 5+ = ⟨2, 3, 6, 7⟩, and 6+ =
⟨4, 5, 6, 7⟩ and another three matches for path of ′ , ′7 = ⟨1, 3, 7⟩, ′8 = ⟨2, 3, 7⟩, and
′9 = ⟨4, 5, 7⟩, leading to updated counts of 6 for  and 3 for . The result pattern trie is
depicted in the middle part of the figure.</p>
          <p>The next event triggering this process for the PTM is event 10. Under the STNM policy, only one
new match is detected for the ′ candidate pattern, namely match ′10 = ⟨8910⟩. PTM updates the
0.3 &lt;   .</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>3.2. Update Algorithms</title>
          <p>corresponding  path, by incrementing the count by one, producing the final form of the pattern
trie as depicted in the right part of the figure. In each update step, PTM checks whether the candidate
pattern satisfies the confidence threshold. Considering a   equal to 40%, in the last step, i.e. on the
arrival of 10, where the candidate pattern  as a variation path is updated with () = 4,
PTM computes the  (⟨, , ⟩) as  (⟨, , ⟩) =  (⟨, , ⟩)/ ∑︀∈root.children  () =
()/#ℎ =4/10= 0.4 ≥   , meaning that among all valid matches detected the 40% is a
sequence of ⟨, , ⟩ (within the window and policy constraint as imposed by the original pattern ).
Therefore, the candidate pattern ⟨, , ⟩ is suggested to the user as an interesting pattern that could
alter the original one. In contrast, the extension ⟨, , , ⟩ is not yet suggested, since its confidence
remains below the required threshold, i.e.  (⟨, , , ⟩) =  (⟨, , , ⟩)/#ℎ =3/10=
To ensure that the pattern trie remains continuously updated and consistent with the evolving event
stream, we developed three algorithms of increasing eficiency for maintaining the counts and
computing global probabilities of candidate patterns. We present (i) a baseline Brute-Force method, that
enumerates matches using a full CEP evaluation, and two optimized variants exploiting LimeCEP’s
internal structures: (ii) Reverse Count, that reconstructs counts by traversing backwards through the
window, and (iii) Incremental Count, which reuses prefix counts that are stored in the   structure of
LimeCEP system, with minimal window correction. Each version produces identical logical results but
with increasingly improved runtime eficiency.
3.2.1. Brute-Force Algorithm
The brute-force method provides the most direct, but also the most expensive, way to update the
pattern trie. For every newly arrived event  whose type is not presence in the pattern types, candidate
patterns  = {+, ′ } are created and the algorithm runs an onDemand CEP process, to detect all the
corresponding matches for pattern  within the window  under selection policy . Each enumerated
match contributes one increment to the corresponding path  in the pattern trie.</p>
          <p>Algorithm 1: Brute-Force Trie Update</p>
          <p>This approach guarantees correctness and serves as a reference baseline. However, since every update
triggers a complete match re-evaluation, its cost is proportional to the full window size and becomes
prohibitive under high event rates or large windows. An illustrative example regarding the matches is
provided in Figure 3 continuing the example of Figure 2b.</p>
          <p>Complexity. The Brute-Force method recomputes all matches of each candidate pattern  by
processing the entire active window for this specific event that triggered the process. Since detection of
Require: Pattern , event stream 
1: prefix ← extractPrefix()
2: for each event  ∈  do
3: if ( = ()) ∈/  then
4: + ← prefix ‖ 
5: ′ ←  ‖ 
6: ifnd_matches( +)
7: update_trie_counts(+)
8: ifnd_matches( ′ )
9: update_trie_counts(′ )
10:
11:
else
ifnd_matches_and_update_trie_counts() // original pattern 
// prefix of the pattern
// pattern extension  +
// pattern variation  ′</p>
          <p>S = ⟨A1, A2, B3, A4, B5, C6, D7⟩</p>
          <p>Matches for p' = ABC
m1 A1 B3 C6</p>
          <p>m'7 A1 B3 D7
C6
m2 A2 B3 C6</p>
          <p>D7</p>
          <p>m'8 A2 B3 D7
m3 A4 B5 C6
m+4 A1 B3 C6 D7
D7
m+5 A2 B3 C6 D7
m+6 A4 B5 C6 D7</p>
          <p>Matches for p+ = ABCD
m'9 A4 B5 D7</p>
          <p>Matches for p' = ABD
matches is performed for both the extension + and the variation ′ , each update incurs the full cost of
a CEP match evaluation. This leads to poor scalability as the window grows, and makes the approach
suitable mainly as a correctness baseline rather than for continuous high-throughput processing.</p>
          <p>Portability. The Brute-Force algorithm is fully portable across CEP systems (e.g. SASE, openCEP,
FlinkCEP), as it requires only CEP functionality of match detection. No assumptions are made about
the internal representation of CEP engine (e.g. automata-based, tree-based etc.). For example, in SASE,
a second NFA could be instantiated whenever a new event type arrived in the system.
3.2.2. Optimization: Reverse Count
To improve the eficiency on pattern exploration, we present an optimization to the Brute-Force
algorithm, named Reverse Count (RC). The RC method avoids full CEP match computation and detection,
by counting the number of matches based on predecessor events. Given a candidate pattern  =
(⟨1, . . . , ⟩, , ), where  is the type of the newly arrived event , the algorithm begins from
this event and going backwards, to find the relevant preceding events of previous types in pattern. For
each relevant event retrieved, it computes recursively the count, until it reaches a the starting type, i.e.
−1 → −2 → · · · →  1.</p>
          <p>Algorithm 2: Reverse Count
Require: Window , pattern sequence  = ⟨1, . . . , ⟩, prefix  , event stream ,</p>
          <p>selection policy  ∈ {STNM, STAM}
1: for each event  ∈  do
2:  ← ()
3: if  ∈/  then
4: + ←  ‖ 
5: ′ ←  ‖ 
6: for each pattern  in {+, ′ } do
7:  = [1, 2, . . . , ] where  = 
8: /* Backwards retrieval stage */
9: Initialize [] ← {} // set (STNM) or multiset (STAM)
10: for  =  − 1 down to 1 do
11: [] ← ∅
12: for each event +1 ∈ [+1] do
13: Retrieve all events of type  in  that precede +1
14: Add them to [] // STAM: with duplicates; STNM: w/o</p>
          <p>duplicates
15:
16:
17:
18:
19:
20:
21:
/* Forward counting stage */
Count(2) ← |[ 1]|
for  = 3 to  do
for each event  ∈ [] do</p>
          <p>() ← ∑︀−1 ≺  (−1 )
TotalCount() ← ∑︀∈[] ()
Update trie path for  with TotalCount()
S = ⟨A1, CAO2U,NBT3(A,BAC46,)=B25+,1C=63, D7⟩</p>
          <p>C6 B3 B5
COUNT(AB3)=2 COUNT(AB5)=1
D7</p>
          <p>A1</p>
          <p>A2 A4
COUNT(ABCD7)=3</p>
          <p>C6
COUNT(ABC6)=2+1=3</p>
          <p>B3 B5
COUNT(AB3)=2 COUNT(AB5)=1</p>
          <p>A1</p>
          <p>A2</p>
          <p>A4</p>
          <p>COUNT(ABD7)=2+1=3
D7 B3 B5
COUNT(AB3)=2 COUNT(AB5)=1</p>
          <p>A1</p>
          <p>A2</p>
          <p>A4
p COUNT(ABC)=6</p>
          <p>p+ p'</p>
          <p>COUNT(ABCD)=3 COUNT(ABD)=3</p>
          <p>Consider the stream  = ⟨1, 2, 3, 4, 5, 6, 7⟩, and the user-defined pattern is  =
(⟨, , ⟩,  =   ,  = 10). When event 7 arrives, RC starts the exploration of + =
(⟨, , , ⟩,  =   ,  = 10), and therefore we have 4 = . The first step is to identify
and retrieve all preceding events of the previous type (i.e. 3 = ) that satisfy the window and policy
constraints, namely 6. For each of these predecessors, it computes recursively the count and sum
it up to the total count. The count of 6 will be computed based on the preceding events within 
under , namely 3 and 5. Then, respectively, the count of 3 will be computed based on the set of
preceding events of type , i.e., 1 and 2. Since  is the starting type, the recursion will stop, and will
return  = 1 for each event  in the set of s. Then, RC starts the forward counting stage, and
computes the corresponding counts. This step-by-step process is illustrated in Figure 4.</p>
          <p>This procedure is followed independently for both extension + and variation ′ of the pattern.</p>
          <p>Complexity. Unlike the Brute-Force method, which performs a full match detection for every update,
RC algorithm limits the processing only to the predecessor events relevant to the triggering event. As
a result, RC achieves substantially lower update latency, while still producing accurate match counts
even without reconstructing the actual matches.</p>
          <p>Portability. The RC algorithm remains portable across diferent CEP engines, since it requires simple
functionality of retrieving the predecessors of a specific type within a window and under a selection
policy. This can be implemented on top of any system like SASE or openCEP, without assuming any
particular internal representation structure. However, its eficiency depends on the data structures
provided by the system for the event storage. Engines that maintain their incoming events in time
ordered indices per type, like LimeCEP with TreeSet structures (STS), can provide eficiently an event’s
predecessors, while for engines that keep events in (not-indexed) lists, additional mechanisms may be
required to enable eficient retrieval.
3.2.3. Optimization alternative: Incremental Count
Although the RC algorithm has improved performance compared to the BF algorithm, by avoiding full
match detection, it still recomputes the count for predecessor events repeatedly, i.e., every time an event
appears in the predecessor set, RC recomputes its contribution by recursively traversing its previous
events. This results in significant overhead, since under the  consumption policy adopted in this
work, each event can participate in multiple matches.</p>
          <p>Algorithm 3: Incremental Count
Require: Window , pattern  = [1, . . . , ], prefix  , event stream ,
selec</p>
          <p>tion policy  ∈ {STNM, STAM}
1: for each event  ∈  do
2:  ← ()
3: if  ∈/  then
4: + ← ‖
5: ′ ←  ‖
6: for each pattern  in {+, ′ } do
7: Let  = [1, 2, . . . , ] where  = 
8:   ←  −1
9: /* Retrieve relevant events */
10:   ← all events of type   in  preceding 
11: /* Compute Window-correct counts */
12: for each event  ∈   do
13:  ← .
// stored in STS, computed at insertion under , 
// time diference</p>
          <p>S = ⟨A1, A2, B3, A4, B5, C6, D7⟩
STS
A A1 A2 A4
B
C
D
C6
D7
D7</p>
          <p>B5 1
B3 2
C6 3
D7 CCOOUUNNTT((AABBCDD))
B3 B5</p>
          <p>2 + 1 = 3
C6</p>
          <p>3
B3 B5</p>
          <p>2 + 1 = 3</p>
          <p>To eliminate this, we introduce Incremental Count (IC) algorithm, which leverages the counts already
stored in the internal structure of the CEP system, along with each event. More specifically, we modified
the LimeCEP’s internal engine to compute this score once, on event arrival, for the user-defined pattern
and under the constraints imposed by it. The pre-computed counts are stored in the internal structure of
LimeCEP, namely STS, alongside with the events, so that they are easily accessible; therefore, whenever
a later computation requires the contribution of a specific event, IC directly retrieves this stored count
from the STS. In cases where a predecessor event  is close to the window boundary, a part of its
stored count may not be valid for the currently processed event. Instead of recomputing the count for
 from scratch, the IC algorithm applies a lightweight adaptation to the count that excludes expired
14:
15:
16:
17:
18:
19:
20:
Δ ← . − .
_ ← (0,  − Δ)
() ← (, _)
/* Compute count for  */
() ← ∑︀∈  ()
STS.add( , , ())</p>
          <p>Update trie path for  with ()
contributing events. Intuitively, IC checks the time distance of  from the end of the current window
defined by the triggering event  as Δ = . − . , and then subtracts this diference from the
window size _ = (0,  − Δ) . Then, the window-correct count for this event  is computed
as corrected() = min(︀ count(), _︀) , therefore reducing its stored count by the amount that
falls outside the valid time range, allowing IC to maintain correct match counts.</p>
          <p>In the previous example, when 3 arrives, the system retrieves its predecessors ⟨1, 2⟩), computes
its count as 2, and stores it. The same applies to 5, whose count is 1. When 6 arrives, IC identifies its
predecessors ⟨3 and 5⟩, but instead of starting a recursive process similar to RC algorithm, it directly
fetches their stored counts and sums them, immediately computing the corresponding () for
event 6, that is (6) = 2 + 1 = 3. This value is then stored in the STS. When 7 arrives, IC again
retrieves the relevant predecessors (only 6 event for the extension pattern , and events ⟨3, 5⟩
for the variation pattern ) and computes their total by adding the previously stored counts. The
resulting counts are then inserted into the pattern trie. The step-by-step execution is illustrated in
Figure 5. Regarding the window-correct counts, if an event 12 arrives, the IC will explore the candidate
pattern  and retrieve the relevant predecessors of 12 for type  including 3. Even though 3 is a
relevant preceding event, its count contains invalid prefix contributing (i.e. match ⟨1, 3⟩). To fix this,
IC follows the process described above, computing the the invalid contributions, and correcting the count
for 3 using the formula (3) ← ( 3.,  − ( 12. −  3.)) = (2, 10 − (12 − 3)) =
(2, 1) = 1. Therefore, the window-correct count for 3 when the end event is 12 is 1 instead of 2.</p>
          <p>Complexity. The IC algorithm avoids the redundant recomputations performed by the RC algorithm
and reduces the update cost to the cost of retrieving a small number of stored counts from the STS.
Since predecessor sets are substantially smaller than the full window, IC provides the best scalability
among the three update methods proposed.</p>
          <p>Portability. The IC algorithm can, in practice, be implemented on top of any CEP engine with some
modifications since it conceptually requires only: (i) retrieval of predecessor events under a window
and selection policy, and (ii) the capability to maintain per-event counts. However, achieving the same
eficiency as in LimeCEP is not guaranteed. IC relies heavily on LimeCEP’s TreeSet-based event storage
structure, which provides logarithmic-time (log ) access to predecessor events. Systems that do not
maintain a similar indexed time-ordered structure, such as SASE, would require additional and more
sophisticated mechanisms to support IC eficiently.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Evaluation</title>
      <p>Setup, Competitors, Queries and Datasets. All experiments were conducted on LimeCEP extended
with the proposed PTM and pattern trie, running on a single machine equipped with 32,GB RAM and
an 8-core 3.0GHz CPU. Each experiment was repeated five times, and we report median results. We
evaluate the runtime overhead and mining behavior introduced by automated pattern exploration under
varying event streams, window sizes, and configurations. We compare four configurations: (i) LC,
LimeCEP without pattern exploration; (ii) BF, the brute-force approach; (iii) RC, the reverse-count
approach; and (iv) IC, the incremental-count approach. Their implementation details are described in
Section 3.2. The evaluation uses three synthetic event streams: A–F with 100,000 events over 6 event
types (– ), A–K with 100,000 events over 11 types (–), and A–J with 1,000,000 events over 10
types (– ). We also use a real-world stream derived from Google Cluster traces [17], containing 9
event types corresponding to task life-cycle state transitions. For the synthetic streams, we evaluate the
pattern query 1 = (⟨, , ⟩,  = STNM, ), with window sizes  ∈ 0.1K, 1K, 10K minutes. For the
real-world dataset, we use 2 = (⟨Submit, Enable, Schedule⟩,  = STNM,  = 100).</p>
      <p>Execution Time. Figure 6 depicts the execution time and throughput under diferent configurations
and window sizes for the synthetic data streams (Figure 6a) and the Google Cluster traces (Figure 6b).</p>
      <p>In synthetic streams, the brute-force approach exhibits a substantially rapid increase in execution time
as the window size increases, which leads to performance degradation. This is more pronounced in the
second stream. Although execution time increases for all approaches as the window size increases, the</p>
      <p>A-F Stream</p>
      <p>A-J Stream</p>
      <p>A-K Stream
)
c
se1600
/
s
tn1400
e
ve1200
(
tu1000
p
gh800
u
ro600
h
T
LC</p>
      <p>BF</p>
      <p>RC</p>
      <p>IC</p>
      <p>BF</p>
      <p>RC</p>
      <p>IC
(b) Real-world event stream
performance of BF is significantly worse than optimized alternatives. This can also be verified from the
throughput figures, where BF has the lowest throughput. This behavior is expected, as BF recomputes all
valid matches from scratch for each explored candidate pattern, causing its cost to grow incrementally
with increasing window sizes, especially when the number of matches increases. Contrary, RC and IC,
by exploiting predecessor counts, they maintain substantially more stable execution times than BF, while
following the same throughput trend as LC across all window sizes. The LC approach represents the
lower bound in runtime performance, executing the original CEP query without mining overhead, while
exhibiting the same window-dependent throughput degradation inherent to LimeCEP. RC and IC exhibit
comparable performance on all synthetic datasets. This result is expected, since RC already operates
eficiently by traversing preceding events. IC approach is not designed to universally outperform RC,
rather to eliminate repeated recomputation by reusing prefix counts maintained within the CEP engine.</p>
      <p>On the real-world dataset (Figure 6b), BF fails to execute the detection and is aborted after 5 hours,
whereas, both RC and IC complete execution within approximately 1 minute and maintain comparable
throughput to LC, demonstrating the practicality of automated pattern exploration under realistic event
streams. However, their relative performance difers with IC exhibiting lower execution time (61 vs.
65.5 s), achieving 7% higher throughput (1639 vs. 1526 events/second).</p>
      <p>Memory &amp; CPU Regarding the memory consumption, both figures for synthetic (Figure 7a) and
real-world streams (Figure 7b) highlight the overhead of full match detection that BF performs in
order to explore the candidate patterns, against the more eficient count-based RC and IC solutions.
BF consistently exhibits the highest memory usage across all experiments, with the size of memory
consumption growing rapidly as window size increases. In contrast, RC and IC algorithms maintain a
compact internal state, therefore their memory footprint increases more gradually with the window
size, and remains consistently lower than that of BF. In some cases, however, IC exhibits higher memory
overhead than RC algorithm. This diference stems from the additional metadata maintained by IC in
memory, namely pre-stored prefix counts for each event. The benefit of storing these counts depends
on the degree of reuse of events across multiple matches. The higher the number of matches an event
participates in, the more times the same count would need to be recomputed, and consequently, the
greater the benefit of having such counts pre-stored. This trade-of suggests that the IC algorithm is
particularly beneficial in settings with high match overlap, such as under the STAM selection policy
(no experiments are presented due to space limitations).</p>
      <p>CPU utilization exhibits a complementary trend, with BF having higher and more variable CPU usage,
due to repeated full match detection across multiple candidate pattern evolutions. In contrast, RC and IC
Method</p>
      <p>LC
BF
RC</p>
      <p>IC
Method</p>
      <p>LC
BF
RC
IC
(a) Synthetic event streams
(b) Real-world event stream
demonstrate more stable performance across window sizes, with IC incurring slightly higher CPU usage
than RC, due to additional processing required to maintain and update the incremental counts. This
overhead, however, is not translated into increased execution time or additional throughput degradation.
Moreover, as discussed earlier, IC is expected to be more advantageous in more demanding settings,
e.g. under STAM policy, where the cost of repeatedly computing counts outweighs the overhead of
maintaining them.</p>
      <p>Mining Process. To further analyze the behavior of the mining process itself, we evaluate the
cost of pattern exploration under a more demanding configuration, using the STAM selection policy.
We consider the pattern 1 = (⟨, , ⟩,  = STAM,  = 100) over the A-K synthetic stream. Under
this configuration, BF detects approximately 9.9 × 10 7 matches, while LC, RC, and IC detect 4.9 × 10 5
matches corresponding to the original pattern execution. RC and IC further identify without explicitly
detecting that there are also 8.6 × 10 6 pattern variations and 9 × 10 7 pattern extensions. The brute-force
approach requires approx. 2K seconds to complete and consumes approximately 500 MB of memory.
In contrast, both RC and IC complete execution within approximately one minute, with RC requiring
74.6 seconds and IC 71 seconds, respectively. These results highlight a substantial speedup, about 27×
for IC over BF and 26× for RC over BF, confirming that complete match detection is infeasible under
high-overlap workloads. To highlight the diferences between RC and IC under higher overlap, we
further executed the real-world query 2 under  = STAM with a larger window  = 1000. RC requires
193 minutes and achieves 8.64 events/second, whereas IC requires 121 minutes and achieves 13.74
events/second, corresponding to a 37% reduction in execution time and a 59% increase in throughput.
IC is not intended to universally outperform RC; rather, it eliminates repeated backward traversals by
reusing prefix counts maintained within the CEP engine.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Related Work</title>
      <p>Existing CEP systems, such as SASE[18], Cayuga[19], Esper2, and FlinkCEP3, have established the
main execution paradigms for pattern detection over event streams, using mainly NFAs, with selection
and consumption policies on match construction. Other works, such as ZStream[20], CET[21],
CORECEP[22] and T-REX[23] aim at improving the performance of CEP systems through structural sharing</p>
      <sec id="sec-5-1">
        <title>2https://www.espertech.com/esper/</title>
        <p>
          3https://nightlies.apache.org/flink/flink-docs-master/docs/libs/cep/
and state reduction, by using tree- and hash-based structures for internal representation. Lazy evaluation
has also been explored in OpenCEP[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], to reduce computations and intermediate results, therefore
reducing the detection latency of a match. Formal semantic extensions, such as Symbolic Register
Transducers[24], provide declarative and compositional pattern semantics, enhancing the expressiveness
of CEP engines. Handling eficiently kleene operators has motivated several optimizations, such as
[
          <xref ref-type="bibr" rid="ref11 ref9">9, 25, 11, 26, 27</xref>
          ], for dealing with the exponential growth of internal state and intermediate results.
Robust execution under real-world inconsistencies and possible disorders in the incoming stream has
been explored in [
          <xref ref-type="bibr" rid="ref14 ref15">28, 15, 29, 14</xref>
          ], with probabilistic windowing, speculative processing and replay-based
processing are exploited in combination, to achieve low-latency and correct detection of matches.
More recent works, such as [30, 31, 32], aim to enhance the throughput of CEP systems, by exploiting
parallelization techniques, load balancing, and query decomposition. Each of these works optimize a
diferent aspect on the evaluation process of a CEP system, for a specific user-defined pattern query, yet,
none of them provides a mechanism for online pattern extraction during the CEP execution process.
        </p>
        <p>Sequential Pattern Mining (SPM) and Episode Mining (EM) aim to extract recurring temporal
structures from event sequences. Surveys cover methods for serialized, parallel, and generalized episodes,
often with gap, window, or other temporal constraints[33]. Most of these methods, such as [34], run
ofline, expect event ordering, and do not account for CEP-specific semantics such as selection and
consumption policies, maximal matches, or robustness to disorder.</p>
        <p>Prior works attempt to bridge this gap by learning CEP rules from historical traces. For example,
iCEP[35] learns rule candidates from historical traces using constraint-based inference, DISCES[36]
systematically explores the space of possible query designs through labeled historical sub-streams.
CBDeclare [37], extends DECLARE with branched constraints capturing AND/OR/XOR relationships
and while providing also matching discovery algorithms. A very recent work [38], moves closer to our
setting by discovering CEP queries from operational data, using evolutionary computation. Although
the approach detects candidate patterns using FlinkCEP, this is performed outside the CEP engine,
requires a sampled historical dataset and expert-labeled sequences. In contrast, our work performs
online discovery of candidate patterns within the CEP engine, exploring frequent pattern evolutions,
consistent with original pattern’s CEP semantics, maintaining the throughput characteristics of the
underlying CEP engine.</p>
        <p>Finally, there are methods that learn rule patterns from historical traces using classical ML models [39,
40], while more recent frameworks employ deep networks to extract CEP rules for IoT streams [41].
Reinforcement-learning techniques [42] adjust thresholds of existing CEP rules at runtime using deep
Q-networks on top of FlinkCEP. In parallel, deep models such as [43] support event forecasting and
runtime adaptation by coupling neural predictors with CEP engines [44, 45]. While ML and DL methods
enhance CEP through forecasting and parameter adaptation, they do not explore new event-type
sequences nor evolve the structure of existing patterns, as our proposal does.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions and Future Work</title>
      <p>In this work, we presented an add-on mechanism for CEP systems that enables the online exploration of
frequent variations and/or continuations of a user-defined pattern during query execution. Our approach
employs a lightweight pattern-trie structure and we have also introduced three update algorithms:
a brute-force method based on classical CEP detection and two optimized alternatives that exploit
predecessor event counts and are an order of magnitude faster. Our design allows pattern exploration
to be performed incrementally within the CEP engine. Our experimental evaluation on synthetic and
real-world streams indicates that online exploration of pattern evolutions can be integrated into CEP
execution without significant additional impact on execution time or throughput, even under large
windows and high-throughput workloads. Future work includes extending the current framework to
also consider rare pattern evolutions and to support pattern transformations beyond extensions and
variations of the last event type.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The authors cordially thank Christos Balaktsis for valuable discussions and brainstorming during the
development of this work.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used OpenAI-GPT-5: Improve writing style and
Grammar and spelling check. After using these, the authors reviewed and edited the content as needed
and take full responsibility for the publication’s content.
[16] A. Ziehn, P. M. Grulich, S. Zeuch, V. Markl, Bridging the gap: Complex event processing on stream
processing systems., in: EDBT, 2024, pp. 447–460.
[17] C. Reiss, J. Wilkes, J. L. Hellerstein, Google cluster-usage traces: format+ schema, Google Inc.,</p>
      <p>White Paper 1 (2011) 1–14.
[18] H. Zhang, Y. Diao, N. Immerman, On complexity and optimization of expensive queries in
complex event processing, in: Proceedings of the 2014 ACM SIGMOD international conference on
Management of data, 2014, pp. 217–228.
[19] A. J. Demers, J. Gehrke, B. Panda, M. Riedewald, V. Sharma, W. M. White, et al., Cayuga: A general
purpose event monitoring system., in: Cidr, volume 7, 2007, pp. 412–422.
[20] Y. Mei, S. Madden, Zstream: a cost-based query processor for adaptively detecting composite
events, in: Proc. of SIGMOD, 2009, pp. 193–206.
[21] O. Poppe, C. Lei, S. Ahmed, E. A. Rundensteiner, Complete event trend detection in high-rate
event streams, in: Proceedings of the 2017 ACM International Conference on Management of
Data, 2017, pp. 109–124.
[22] M. Bucchi, A. Grez, A. Quintana, C. Riveros, S. Vansummeren, Core: a complex event recognition
engine, arXiv preprint arXiv:2111.04635 (2021).
[23] G. Cugola, A. Margara, Complex event processing with t-rex, Journal of Systems and Software 85
(2012) 1709–1728.
[24] E. Alevizos, A. Artikis, G. Paliouras, Complex event recognition with symbolic register transducers,</p>
      <p>Proceedings of the VLDB Endowment 17 (2024) 3165–3177.
[25] A. Rozet, O. Poppe, C. Lei, E. A. Rundensteiner, Muse: Multi-query event trend aggregation, in:
Proceedings of the 29th ACM International Conference on Information &amp; Knowledge Management,
2020, pp. 2193–2196.
[26] O. Poppe, C. Lei, E. A. Rundensteiner, D. Maier, Event trend aggregation under rich event matching
semantics, in: Proceedings of the 2019 International Conference on Management of Data, 2019,
pp. 555–572.
[27] O. Poppe, A. Rozet, C. Lei, E. A. Rundensteiner, D. Maier, Sharon: Shared online event sequence
aggregation, in: 2018 IEEE 34th International Conference on Data Engineering (ICDE), IEEE, 2018,
pp. 737–748.
[28] N. Rivetti, N. Zacheilas, A. Gal, V. Kalogeraki, Probabilistic management of late arrival of events, in:
Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems,
2018, pp. 52–63.
[29] C. Mutschler, M. Philippsen, Adaptive speculative processing of out-of-order event streams, ACM</p>
      <p>Transactions on Internet Technology (TOIT) 14 (2014) 1–24.
[30] S. Akili, S. Purtzel, M. Weidlich, Decopa: query decomposition for parallel complex event
processing, Proceedings of the ACM on Management of Data 2 (2024) 1–26.
[31] M. Yankovitch, I. Kolchinsky, A. Schuster, Hypersonic: A hybrid parallelization approach for
scalable complex event processing, in: Proceedings of the 2022 International Conference on
Management of Data, 2022, pp. 1093–1107.
[32] K. Chapnik, I. Kolchinsky, A. Schuster, Darling: data-aware load shedding in complex event
processing systems, Proceedings of the VLDB Endowment 15 (2021) 541–554.
[33] O. Ouarem, F. Nouioua, P. Fournier-Viger, A survey of episode mining, Wiley Interdisciplinary</p>
      <p>Reviews: Data Mining and Knowledge Discovery 14 (2024) e1524.
[34] S. B. Gandreti, P. Sastry, Eficient depth-first search approach for mining frequent chain episodes,
in: Proceedings of the 8th International Conference on Data Science and Management of Data
(12th ACM IKDD CODS and 30th COMAD), 2024, pp. 66–74.
[35] A. Margara, G. Cugola, G. Tamburrelli, Learning from the past: automated rule generation for
complex event processing, in: Proceedings of the 8th ACM international conference on distributed
event-based systems, 2014, pp. 47–58.
[36] R. Sattler, S. Kleest-Meißner, S. Lange, M. L. Schmid, N. Schweikardt, M. Weidlich, Disces:
systematic discovery of event stream queries, Proceedings of the ACM on Management of
Data 3 (2025) 1–26.
[37] C. Balaktsis, I. Mavroudopoulos, M. Comuzzi, A. Gounaris, F. M. Maggi, Discovering comprehensive
branched declarative process constraints, in: International Conference on Business Process
Management, Springer, 2025, pp. 147–164.
[38] G. Appetito, E. Medvet, V. Gulisano, Automated discovery of cep applications with evolutionary
computing, in: Proceedings of the 19th ACM International Conference on Distributed and
Eventbased Systems, 2025, pp. 33–38.
[39] N. Mehdiyev, J. Krumeich, D. Enke, D. Werth, P. Loos, Determination of rule patterns in complex
event processing using machine learning techniques, Procedia Computer Science 61 (2015) 395–401.
[40] R. Mousheimish, Y. Taher, K. Zeitouni, Autocep: automatic learning of predictive rules for complex
event processing, in: International conference on service-oriented computing, Springer, 2016, pp.
586–593.
[41] M. U. Simsek, F. Yildirim Okay, S. Ozdemir, A deep learning-based cep rule extraction framework
for iot data., Journal of Supercomputing 77 (2021).
[42] A. Mdhafar, G. Baklouti, Y. Rebai, M. Jmaiel, B. Freisleben, Rl4cep: reinforcement learning for
updating cep rules, Complex &amp; Intelligent Systems 11 (2025) 137.
[43] T. Xing, M. R. Vilamala, L. Garcia, F. Cerutti, L. Kaplan, A. Preece, M. Srivastava, Deepcep: Deep
complex event processing using distributed multimodal information, in: 2019 IEEE international
conference on smart computing (SMARTCOMP), IEEE, 2019, pp. 87–92.
[44] V. Stavropoulos, E. Alevizos, N. Giatrakos, A. Artikis, Optimizing complex event forecasting, in:
Proceedings of the 16th ACM International Conference on Distributed and Event-Based Systems,
2022, pp. 19–30.
[45] M. Pitsikalis, E. Alevizos, N. Giatrakos, A. Artikis, Run-time adaptation of complex event
forecasting, in: Proceedings of the 19th ACM International Conference on Distributed and Event-based
Systems, 2025, pp. 9–20.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.</given-names>
            <surname>Giatrakos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Alevizos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Artikis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Deligiannakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Garofalakis</surname>
          </string-name>
          ,
          <article-title>Complex event recognition in the big data era: a survey</article-title>
          ,
          <source>The VLDB Journal</source>
          <volume>29</volume>
          (
          <year>2020</year>
          )
          <fpage>313</fpage>
          -
          <lpage>352</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fragkoulis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Carbone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kalavri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Katsifodimos</surname>
          </string-name>
          ,
          <article-title>A survey on the evolution of stream processing systems</article-title>
          ,
          <source>The VLDB Journal</source>
          <volume>33</volume>
          (
          <year>2024</year>
          )
          <fpage>507</fpage>
          -
          <lpage>541</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>I.</given-names>
            <surname>Mavroudopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Tsichlas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gounaris</surname>
          </string-name>
          ,
          <article-title>Sequential pattern detection: similarities and diferences across various fields: I. mavroudopoulos et al</article-title>
          .,
          <source>Data Mining and Knowledge Discovery</source>
          <volume>39</volume>
          (
          <year>2025</year>
          )
          <fpage>38</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L.</given-names>
            <surname>Lan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <article-title>A universal complex event processing mechanism based on edge computing for internet of things real-time monitoring</article-title>
          ,
          <source>IEEE Access 7</source>
          (
          <year>2019</year>
          )
          <fpage>101865</fpage>
          -
          <lpage>101878</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dhillon</surname>
          </string-name>
          ,
          <article-title>Mcep: a mobile device based complex event processing system for remote healthcare</article-title>
          ,
          <source>in: 2018 IEEE International Conference on Internet of Things (iThings)</source>
          ,
          <article-title>Green Computing and Communications (GreenCom), Cyber, Physical and Social Computing (CPSCom) and Smart Data (SmartData)</article-title>
          , IEEE,
          <year>2018</year>
          , pp.
          <fpage>203</fpage>
          -
          <lpage>210</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xhafa</surname>
          </string-name>
          ,
          <article-title>Anomaly detection and complex event processing over iot data streams: with application to EHealth and patient data monitoring</article-title>
          , Academic Press,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Roldán</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Boubeta-Puig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Martínez</surname>
          </string-name>
          ,
          <string-name>
            <surname>G.</surname>
          </string-name>
          <article-title>Ortiz, Integrating complex event processing and machine learning: An intelligent architecture for detecting iot security attacks</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>149</volume>
          (
          <year>2020</year>
          )
          <fpage>113251</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bucchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Grez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Quintana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Riveros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vansummeren</surname>
          </string-name>
          ,
          <article-title>Core: a complex event recognition engine</article-title>
          ,
          <source>arXiv preprint arXiv:2111.04635</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kyrama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gounaris</surname>
          </string-name>
          ,
          <article-title>Exploring alternatives of complex event processing execution engines in demanding cases</article-title>
          ,
          <source>in: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>313</fpage>
          -
          <lpage>320</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>I.</given-names>
            <surname>Kolchinsky</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Sharfman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Schuster</surname>
          </string-name>
          ,
          <article-title>Lazy evaluation methods for detecting complex events</article-title>
          ,
          <source>in: Proc. of DEBS</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>34</fpage>
          -
          <lpage>45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>O.</given-names>
            <surname>Poppe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rozet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Rundensteiner</surname>
          </string-name>
          ,
          <article-title>To share, or not to share online event trend aggregation over bursty event streams</article-title>
          ,
          <source>in: Proceedings of the 2021 International Conference on Management of Data</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1452</fpage>
          -
          <lpage>1464</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>N.</given-names>
            <surname>Rivetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Zacheilas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kalogeraki</surname>
          </string-name>
          ,
          <article-title>Probabilistic management of late arrival of events</article-title>
          ,
          <source>in: Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>52</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kyrama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gounaris</surname>
          </string-name>
          ,
          <article-title>Handling out-of-order input arrival in cep engines on the edge combining optimistic, pessimistic and lazy evaluation</article-title>
          ,
          <source>arXiv preprint arXiv:2507.01461</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Trisminingsih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bou</surname>
          </string-name>
          , T. Amagasa,
          <article-title>Eficient pattern matching over out-of-order event streams using sliding bufer</article-title>
          ,
          <source>Journal of Information Processing</source>
          <volume>32</volume>
          (
          <year>2024</year>
          )
          <fpage>963</fpage>
          -
          <lpage>972</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kyrama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gounaris</surname>
          </string-name>
          ,
          <article-title>Handling out-of-order input arrival in cep engines on the edge combining optimistic, pessimistic and lazy evaluation</article-title>
          ,
          <source>arXiv preprint arXiv:2507.01461</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>