<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards an Integrated Solution for IoT Data Management</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anderson Chaves</string-name>
          <aff>LNCC, Brazil</aff>
          <email>achaves@lncc.br</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabio Porto</string-name>
          <role>Supervisor</role>
        </contrib>
      </contrib-group>
      <abstract>
        <p>The emergence of Big Data and the Internet of Things (IoT) increasingly affects all areas of modern society and is characterized by a huge number of data streams that demand real-time processing and analysis. Systems that assist in the management of these data streams play an important role in IoT applications. However, numerous challenges must be taken into account when building an efficient data system for handling large-scale, dynamic, semi-structured data such as IoT data, and existing solutions only partially address the requirements of these scenarios. In this PhD research, we summarize some of the main challenges involved in building an efficient system for IoT data management and analysis, and show how different data management approaches, such as actor-oriented, array and active databases, fit together and offer strong contributions to these requirements. We also examine the potential of performing Machine Learning inference and handling concept drift in IoT as an integrated database process. Through this work, we lay the groundwork for the development of a Database Management System that combines these different strategies to support large-scale analysis of data streams.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <label>1</label>
      <title>INTRODUCTION</title>
      <p>
        From smart home control systems to transportation, healthcare
and industrial automation, the Internet of Things has brought
great benefits to both individuals and businesses, being used for
better decision making, planning and higher productivity [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The
main characteristic of the IoT paradigm is the use of
different technologies such as communication, embedded systems
and data analytics to create smart devices for intelligent
monitoring, locating, tracking and so forth [
        <xref ref-type="bibr" rid="ref18 ref9">9, 18</xref>
        ].
      </p>
      <p>The efficient management of sensor data from IoT devices is
essential to IoT data analysis. Through Complex Event
Processing (CEP) methods, it is possible to detect anomalies and
meaningful events in data streams and to make decisions in real
time. However, processing and analyzing continuous data
streams from heterogeneous networks still poses a number of
challenges and requires the development of new techniques
and strategies.</p>
      <p>
        A major challenge in an IoT environment is its large-scale
data flows. IoT data can originate from a very wide
range of endpoints that generate masses of data, and it is frequently
semi-structured or unstructured, which places it within the Big Data
paradigm [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Traditional DBMSs, which need to store and index data
before processing it, cannot fulfill the timeliness and scalability
requirements of IoT data streams [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Moreover, existing solutions for
analysis and visualization are often inefficient
because of a mismatch between the structure of
the source data and the analysis tool [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Finally, a number
of privacy and security issues, as well as resource constraints such
as memory, bandwidth and energy, must be taken into account
when building an IoT data management system.
      </p>
      <p>
        Another challenge in IoT is the need for online processing
of data streams, as opposed to offline analysis. Machine learning
(ML) is one of the leading strategies for reliable, efficient
real-time analysis of IoT data in tasks such as prediction or
anomaly detection [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. However, the lack of integration between the ML
application and the data system often limits performance,
since optimizations such as query planning or lazy
evaluation are not possible when the two processes are treated
as completely isolated tasks [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Additionally, when dealing with
dynamic stream data such as IoT data, the data distribution
tends to change over time, resulting in the phenomenon known as
concept drift. It occurs when the statistical properties of the target
variable, which the model is trying to predict, change over time in
unforeseen ways [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. When that happens, the patterns learned from
past data may not be relevant to the new data, leading to poor
predictions and incorrect decisions. Machine-learning-based analysis
must be able not only to detect the drift, but also to understand
and react to it.
      </p>
      <p>
        We argue that data management systems demand efficient
mechanisms to deal with large-scale, heterogeneous IoT data.
Recent work [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] has demonstrated that the programming model of
actor-oriented databases, which is aimed specifically at concurrency
and inherent parallelism, as found in Orleans [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and ReactDB [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], is an
adequate solution for systems focused on IoT data management.
Reactive behavior and CEP techniques are also essential for
evaluating complex patterns over high-throughput data streams such
as those in IoT [
        <xref ref-type="bibr" rid="ref13 ref21">13, 21</xref>
        ]. Since a large part of the data made available by IoT
devices is multidimensional and spatio-temporal [
        <xref ref-type="bibr" rid="ref19 ref9">9, 19</xref>
        ], multidimensional
array data models could provide great advantages for its
management [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. However, managing several different platforms instead
of one makes the resulting solution unnecessarily complex and
potentially inefficient. To the best of our knowledge, no existing
solution combines all these approaches for
IoT scenarios.
      </p>
      <p>Therefore, to address the challenges involved in the development
of an adequate IoT solution, we envision a Database Management
System capable of offering scalable support for IoT data
management as well as analysis through Machine Learning. In this work,
we present the following contributions:</p>
      <list list-type="bullet">
        <list-item>
          <p>We propose the development of a new Database Management
System that offers CEP primitives through actor-based
programming in order to perform rule-based monitoring for
real-time, scalable IoT scenarios.</p>
        </list-item>
        <list-item>
          <p>We propose to further extend our solution to include ML
inference as first-class operators for CEP, enabling closer
integration between the data system and the Machine Learning
tasks.</p>
        </list-item>
        <list-item>
          <p>We propose to investigate the challenges involved in concept
drift handling specifically in an IoT environment, and how
to address these challenges in a data management system.</p>
        </list-item>
      </list>
      <p>[Table 1. Columns: System Features, Actor-Oriented Databases,
Array Databases, Active Databases, Proposed Solution. Rows:
Actor-Based Programming, Dynamic Scalability, Asynchronous
Primitives, Encapsulation, Array-Based Operations, Flexible Storage
Format, Event Detection, Reactive Behavior, ML as First-Class
Operations, Concept Drift Handling. Cells are marked with "+".]</p>
      <p>The remainder of this paper is organized as follows. In Section
2 we present the base concepts for the highlighted problems and
proposed solutions. In Section 3 we present our idea of leveraging
array databases to build a scalable, reactive and intelligent solution
fit for IoT. We conclude and present our research directions in
Section 4.</p>
    </sec>
    <sec id="sec-2">
      <label>2</label>
      <title>RESEARCH CONTEXT</title>
      <p>In this section, we introduce the base concepts of IoT data and
the challenges related to it. Afterward, we present the different database
models that serve as the foundation for the proposed solution. Finally,
we describe the problem of concept drift in the IoT context.</p>
    </sec>
    <sec id="sec-3">
      <label>2.1</label>
      <title>IoT Big Data Challenges</title>
      <p>
        According to [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], big data in IoT has three features that conform
to the big data paradigm: (a) a very wide range of endpoints that
generate masses of data; (b) semi-structured or unstructured data;
(c) data that is only useful after being analyzed.
      </p>
      <p>Data generated by IoT usually comes from a high number of parallel
sources and is subject to inaccuracies and noise during acquisition
and transmission. It can be streamed continuously or accumulated
as a source of big data. In big data analytics, it is
acceptable to produce insights several days after the data is generated, but
in streaming IoT analytics, insights must be delivered
within a few seconds at most. This real-time constraint leads to
the following challenges for IoT big data:</p>
      <p>
        Data Management: Data management is a major challenge that must be
addressed in order to realize the full potential of IoT, and it has
therefore become a key research topic [
        <xref ref-type="bibr" rid="ref17 ref20">17, 20</xref>
        ]. Many IoT systems are
processor-intensive and must process a massive amount of
highly concurrently generated data. How can these data
interactions be managed while ensuring low latency?
      </p>
      <p>
        Visualization: Visualization is important in big data analytics,
especially for IoT systems [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. How can we visualize
the heterogeneous and diversely structured data generated
in IoT?
      </p>
      <p>Data Mining: Realizing the potential of IoT depends on
being able to extract the insights hidden in the vast and ever-increasing
volume of available data. Current data mining approaches do not scale well
to IoT volumes. Which characteristics are most essential for a
system suited to such environments?</p>
      <p>Resource Constraints: In the IoT data stream model, a high
volume of data is produced at high speed. Therefore, algorithms
that process it must do so under very strict constraints of space
and time. Addressing these constraints requires that a significant
amount of data processing happen on edge devices. How can
we design algorithms that work efficiently in such environments?</p>
      <p>Security: Dealing with dynamic scaling while
guaranteeing the protection of data from different entities is another
significant challenge. What is the most effective way to ensure access
control and protection of data from large numbers of devices and,
at the same time, allow the development of a dynamic and flexible
application?</p>
      <p>2.2 Data management solutions</p>
      <p>2.2.1 Array Database Models. Most IoT environments consist
of static or moving sensor devices placed in specific locations
that produce data continuously. Each data item has spatial
coordinates as well as an associated timestamp, which results in high
temporal and spatial correlation. Because of this multidimensional
spatio-temporal nature of IoT data, multidimensional array database
models, built using arrays as the primary data representation, offer
advantages for efficient data management.</p>
      <p>
        Array databases were initially proposed to better represent
sensor, image, simulation, and statistics data, which typically have
spatio-temporal dimensions [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. They have special query languages built upon
array-based algebraic formalizations that model different kinds of
operations such as aggregation or subsetting. Cells in an array have
an intrinsic ordering, which makes it possible to look up values
quickly. Array indexes do not need to be
stored, since they can be inferred from the position of a cell, which
saves storage space. Arrays can also be split into subarrays (called
tiles or chunks) that serve as processing and storage units and help
answer queries efficiently.
      </p>
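      <p>As a minimal sketch of why cell coordinates need not be stored, the
following Python fragment converts between a cell's position in a dense
row-major buffer and its coordinates, and computes the tile that contains
a cell. The array shape, the tile shape and all function names are
illustrative assumptions and do not describe the layout of any particular
array DBMS.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import Tuple


def coords_to_offset(coords: Tuple[int, ...], shape: Tuple[int, ...]) -> int:
    """Row-major offset of a cell, computed from its coordinates alone."""
    offset = 0
    for c, dim in zip(coords, shape):
        offset = offset * dim + c
    return offset


def offset_to_coords(offset: int, shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Recover cell coordinates from a row-major offset (no stored index)."""
    coords = []
    for dim in reversed(shape):
        coords.append(offset % dim)
        offset //= dim
    return tuple(reversed(coords))


def tile_of(coords: Tuple[int, ...], tile_shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Identifier of the tile (chunk) that contains the given cell."""
    return tuple(c // t for c, t in zip(coords, tile_shape))


# Hypothetical (time, latitude, longitude) array of sensor readings.
SHAPE = (4, 3, 5)
TILE_SHAPE = (2, 3, 5)  # assumption: each tile holds two consecutive time steps

cell = (3, 1, 2)
offset = coords_to_offset(cell, SHAPE)
recovered = offset_to_coords(offset, SHAPE)
tile = tile_of(cell, TILE_SHAPE)
```
]]></preformat>
      <p>In this sketch only the cell values would be stored; a range query
over time steps can use the tile identifier to skip tiles that cannot
contain matching cells.</p>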
      <p>
        Recently, some research effort has been devoted to
integrating ML tools and array DBMSs [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. Rasdaman [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
allows the implementation of machine learning algorithms through
User Defined Types and Functions that implement the underlying
linear algebra operations directly over the arrays. In
SciDB [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], users are provided with linear algebra operators that
can be used as building blocks to implement ML algorithms.
In SAVIME [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], users can perform inference from machine
learning models as part of the query expression, allowing the joint
optimization of the data preparation process and its input to the
model.
      </p>
      <p>
        2.2.2 Active Databases and Complex Event Processing. An event can
be defined as an occurrence of significance in a system [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
Historically, many different initiatives have studied event processing for
different reasons. Active Databases were intended to extend traditional
DBMSs by enabling the specification of reactive behavior. The idea
was to develop strategies to respond automatically to events and
changes in the database state through mechanisms formalized as
Event-Condition-Action (ECA) rules [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]: if an event is detected and any of the previously
defined conditions becomes true, then the corresponding action is taken
without any external intervention.
      </p>
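      <p>The event, condition and action parts of such a rule can be
illustrated with a small Python sketch. The class names, the event type
"insert_reading" and the temperature threshold are hypothetical and serve
only to show the control flow; they are not taken from any active
database system.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, float]


@dataclass
class EcaRule:
    event_type: str                      # E: which event triggers the rule
    condition: Callable[[Event], bool]   # C: predicate over the event payload
    action: Callable[[Event], None]      # A: reaction executed automatically


class RuleEngine:
    """Evaluates every registered rule whenever an event is signalled."""

    def __init__(self) -> None:
        self.rules: List[EcaRule] = []

    def register(self, rule: EcaRule) -> None:
        self.rules.append(rule)

    def on_event(self, event_type: str, event: Event) -> None:
        for rule in self.rules:
            if rule.event_type == event_type and rule.condition(event):
                rule.action(event)


alerts: List[str] = []
engine = RuleEngine()
engine.register(EcaRule(
    event_type="insert_reading",                    # hypothetical event name
    condition=lambda e: e["temperature"] > 80.0,    # illustrative threshold
    action=lambda e: alerts.append(f"overheat on sensor {int(e['sensor_id'])}"),
))

# Two inserted readings: only the first satisfies the condition.
engine.on_event("insert_reading", {"sensor_id": 7, "temperature": 85.5})
engine.on_event("insert_reading", {"sensor_id": 8, "temperature": 21.0})
```
]]></preformat>
      <p>After both events are processed, only the reading from sensor 7 has
triggered the action, without any intervention from the code that inserted
the readings.</p>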
      <p>
        Complex Event Processing extends the logic behind ECA rules.
It is understood as a set of techniques that together
perform real-time stream processing for monitoring and detecting
arbitrarily complex patterns in massive data streams [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. These techniques
are commonly used in IoT environments to enable real-time or
near real-time decisions [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In CEP, each data item is abstracted
as an event produced by a data source. A CEP engine combines
multiple simpler events to produce more complex ones that match
previously defined patterns. It typically must process multiple data
streams from different sources in order to simultaneously track
hundreds or even thousands of different patterns through
evaluation mechanisms such as non-deterministic finite automata or
tree-based plans [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        2.2.3 Actor Oriented Databases. The actor programming model is
a well-known model for distributed and concurrent programming,
in which the actor is the fundamental computing unit. Its main
principle is that, in a system, the control flow and the data flow
must be inseparable. Actors do not share state and communicate
via asynchronous messages. Because of these characteristics, actors
are a scalable solution for managing any number
of independent and heterogeneous streaming data sources.
      </p>
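      <p>The two properties named above, private state and asynchronous
messages, can be sketched with Python's standard asyncio library. This is
an illustration of the actor idea only; the SensorActor class and its
message protocol are assumptions and do not reflect the API of Orleans or
ReactDB.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import asyncio
from typing import Dict, List, Optional


class SensorActor:
    """An actor owns private state and a mailbox; it is reached only by messages."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._mailbox: asyncio.Queue = asyncio.Queue()
        self._readings: List[float] = []   # private state, never shared

    async def send(self, message: Optional[float]) -> None:
        """Asynchronously enqueue a message; None asks the actor to stop."""
        await self._mailbox.put(message)

    async def run(self) -> float:
        """Process messages one at a time and return the mean reading."""
        while True:
            message = await self._mailbox.get()
            if message is None:
                break
            self._readings.append(message)
        return sum(self._readings) / len(self._readings)


async def main() -> Dict[str, float]:
    # One actor per (hypothetical) sensor; each runs as an independent task.
    actors = [SensorActor("s1"), SensorActor("s2")]
    tasks = [asyncio.create_task(a.run()) for a in actors]
    await actors[0].send(20.0)
    await actors[1].send(30.0)
    await actors[0].send(22.0)
    for a in actors:
        await a.send(None)
    means = await asyncio.gather(*tasks)
    return {a.name: m for a, m in zip(actors, means)}


result = asyncio.run(main())
```
]]></preformat>
      <p>Because each actor touches only its own readings, adding another
data source in this sketch amounts to creating another actor, with no
locking between them.</p>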
      <p>
        Recent work has demonstrated the effectiveness of
integrating data management features such as transactions and
indexing into actor runtimes [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The authors of [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] demonstrate
that this solution is in fact very suitable for IoT data
management. A similar approach has sought to integrate actor primitives
into relational databases [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] by extending the programmability of
stored procedures with actor objects, taking advantage of the
database's state management features.
      </p>
    </sec>
    <sec id="sec-4">
      <label>2.3</label>
      <title>IoT Concept Drift</title>
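      <p>
        Concept drift can be formally defined as follows [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]: given a
time period <inline-formula><tex-math>[0, t]</tex-math></inline-formula>, a set of samples, denoted as
<inline-formula><tex-math>S_{0,t} = \{d_0, \ldots, d_t\}</tex-math></inline-formula>,
where <inline-formula><tex-math>d_i = (X_i, y_i)</tex-math></inline-formula> is one observation or data instance,
<inline-formula><tex-math>X_i</tex-math></inline-formula> is the feature vector,
<inline-formula><tex-math>y_i</tex-math></inline-formula> is the label, and
<inline-formula><tex-math>S_{0,t}</tex-math></inline-formula> follows a certain
distribution <inline-formula><tex-math>F_{0,t}(X, y)</tex-math></inline-formula>. Concept drift occurs at timestamp
<inline-formula><tex-math>t + 1</tex-math></inline-formula> if
<inline-formula><tex-math>F_{0,t}(X, y) \neq F_{t+1,\infty}(X, y)</tex-math></inline-formula>.
      </p>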
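      <p>To make the definition concrete, the sketch below flags a drift when
the mean of a sliding window of recent values departs from the mean of an
initial reference window. It is a deliberately simple stand-in that
inspects a single numeric feature only; it is not the method of the cited
works nor the detector of the proposed system, and the function name,
window size and threshold are assumptions.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from statistics import mean, stdev
from typing import List, Optional, Sequence


def detect_drift(stream: Sequence[float], window: int = 50,
                 threshold: float = 3.0) -> Optional[int]:
    """Return the first index at which the mean of the most recent `window`
    values deviates from the reference mean by more than `threshold`
    standard errors, or None if no such index exists."""
    reference = list(stream[:window])
    ref_mean, ref_std = mean(reference), stdev(reference)
    std_error = ref_std / window ** 0.5
    for end in range(2 * window, len(stream) + 1):
        recent = stream[end - window:end]
        if abs(mean(recent) - ref_mean) > threshold * std_error:
            return end - 1
    return None


# Synthetic stream: the values shift upward by 2.0 starting at index 100.
stable: List[float] = [10.0 + (i % 5) * 0.1 for i in range(200)]
drifting: List[float] = stable[:100] + [v + 2.0 for v in stable[100:]]

drift_at = detect_drift(drifting)
no_drift = detect_drift(stable)
```
]]></preformat>
      <p>A change in the mean of one feature is only one way in which two
distributions can differ; practical detectors typically monitor richer
statistics, such as model error rates or full distribution distances.</p>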
      <p>Research on learning under concept drift involves three
components beyond traditional training and prediction: drift detection,
drift understanding and drift adaptation. Drift detection determines whether
or not a concept drift has occurred in a data stream. Drift
understanding concerns when, how and where it occurs. Finally, drift
adaptation refers to reacting to the existence of a drift.</p>
      <p>
        Recently, some works have addressed concept
drift specifically in IoT platforms. For example, [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]
proposes an ensemble learning method based on offline classifiers
to address concept drift and imbalanced data concurrently. In [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
an unsupervised, model-independent methodology is proposed
to detect drifts in data generated by IoT devices. In [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], a
concept-drift-adaptive method is proposed for anomaly detection in
IoT services that accounts for the influence of time on changes in the
sample distribution. However, this topic is not yet fully explored and many
research opportunities remain.
      </p>
    </sec>
    <sec id="sec-6">
      <label>3</label>
      <title>LEVERAGE ARRAY DATABASES TO IOT COMPLEX EVENT PROCESSING</title>
      <p>Historically, Database Management Systems have offered many
benefits to data-intensive applications, such as transactions,
indexing, query planning and declarative query languages. An IoT data
management solution must answer specific demands, such as
encapsulation for isolating state and access control, asynchronous
primitives, and dynamic scalability, since in many scenarios
sensing devices can instantly enter and leave a system. It should be
able to detect and react to predefined data patterns automatically,
while providing quick data access and efficient integration with
ML analysis. Table 1 highlights the strong contributions offered
by active, actor-oriented and array databases to each of these IoT
demands.</p>
      <p>[Figure 1. Layers: Things Layer, Actors Layer, Analysis Layer.
Component labels: Sensor Devices; Stream Data; Working Storage; Query
Processor (continuous); Concept Drift Detector; Event Detector; Staging
Data; Event Processor (Local); Storage; Array; Continuous Loader; Model
Manager; Event Processor (Global); Array Data Structures; Array Data
Manager.]</p>
      <p>
        Drawing inspiration from Orleans [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which
added data-management functionality to a virtual actor runtime,
and ReactDB [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], which integrates actor features into a relational
database system, we investigate the potential of performing event
detection and reactive behavior through actor-based primitives in
an array database model. Figure 1 illustrates the proposed idea. At
the things layer, data is collected from sensor devices and
communicated to actor engines at the actors layer. Distributed actors
manage these intermediate nodes, which process data from their
attached sensors and detect relevant (local) events before sending them to
the cloud-based data center, along with relevant data in the form
of array data structures. At the analysis layer, global queries and
analyses that take into account the alerts provided by actors can be
run over the collected data. The intention is to provide a
low-latency environment with a reduced communication
bottleneck.
      </p>
      <p>Integrating ML-based analytics into the Data
Management System may lead to powerful optimization opportunities,
since different parts of the ML process may be treated as operators
of the query plan. To cope with the growing need for ML support
in IoT data systems, we aim to provide both a local and a global
event detector that support ML inference from trained models as
first-class operators.</p>
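      <p>One way to picture inference as an operator of the query plan is a
pipeline of lazy operators in which a prediction step sits between two
filters. The sketch below is an assumption-laden illustration: the operator
names, the toy_model function and its coefficients are hypothetical, and it
does not describe the query processor of SAVIME or of the proposed
system.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, float]


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Leaf operator: yields rows lazily from a source."""
    yield from rows


def select(child: Iterator[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Filter operator: passes on only the rows that satisfy the predicate."""
    return (row for row in child if predicate(row))


def predict(child: Iterator[Row], model: Callable[[Row], float],
            output: str) -> Iterator[Row]:
    """Inference operator: appends the model output as a new attribute."""
    for row in child:
        yield {**row, output: model(row)}


def toy_model(row: Row) -> float:
    """Stand-in for a trained model (hypothetical coefficients)."""
    return 0.5 * row["temperature"] + 2.0 * row["vibration"]


readings: List[Row] = [
    {"sensor_id": 1, "temperature": 70.0, "vibration": 1.0},
    {"sensor_id": 2, "temperature": 90.0, "vibration": 4.0},
    {"sensor_id": 3, "temperature": -999.0, "vibration": 0.0},  # invalid reading
]

# Plan: scan -> drop invalid rows -> infer risk -> keep high-risk rows.
plan = select(
    predict(
        select(scan(readings), lambda r: r["temperature"] > -100.0),
        toy_model, "risk"),
    lambda r: r["risk"] > 50.0)
events = list(plan)
```
]]></preformat>
      <p>Because the validity filter and the prediction belong to the same
plan, the model is never invoked on the invalid reading, which is the kind
of optimization that is unavailable when the data system and the ML
application are isolated.</p>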
      <p>In IoT environments, data communicated by devices is
usually collected and recorded under the assumption of a temporal
relationship between records. As time goes on, concept drift is bound
to occur, which may reduce the accuracy of any method that relies on
long-term statistical attributes of the data. The proposed solution will
include a central drift detector that, based on the local drift
detectors, is able to determine if and when a drift occurred as well as
the best reaction to it.</p>
    </sec>
    <sec id="sec-7">
      <label>4</label>
      <title>CONCLUSION AND RESEARCH DIRECTION</title>
      <p>In this paper, we discuss the characteristics and challenges of IoT data
management and summarize the potential contributions of
different strategies in addressing each of them. Our goal is to build an
efficient, in-memory data management system that combines
these contributions into a single integrated solution,
while offering robust support for data analysis through Machine
Learning. As the next step in our study, we aim to focus on the
design refinement and implementation of a prototype system as
a foundation for our subsequent investigations. To evaluate the
viability of our approach, we intend to apply it to a real use-case
scenario that presents the IoT characteristics and challenges
described. We also intend to perform comparative experiments with
state-of-the-art big data frameworks in order to demonstrate the
optimization opportunities that we envision.</p>
    </sec>
    <sec id="sec-8">
      <label>5</label>
      <title>ACKNOWLEDGEMENT</title>
      <p>We would like to thank CAPES for its scholarships, and Petrobras
for financing this work through the Gypscie project.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Furqan</given-names>
            <surname>Alam</surname>
          </string-name>
          , Rashid Mehmood, Iyad Katib, and
          <string-name>
            <given-names>Aiiad</given-names>
            <surname>Albeshri</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Analysis of eight data mining algorithms for smarter Internet of Things (IoT)</article-title>
          .
          <source>Procedia Computer Science</source>
          <volume>98</volume>
          (
          <year>2016</year>
          ),
          <fpage>437</fpage>
          -
          <lpage>442</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Mohsen</given-names>
            <surname>Asghari</surname>
          </string-name>
          , Daniel Sierra-Sosa,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Telahun</surname>
          </string-name>
          , Anup Kumar, and Adel S Elmaghraby.
          <year>2020</year>
          .
          <article-title>Aggregate density-based concept drift identification for dynamic sensor data models</article-title>
          .
          <source>Neural Computing and Applications</source>
          (
          <year>2020</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Peter</given-names>
            <surname>Baumann</surname>
          </string-name>
          , Andreas Dehmel, Paula Furtado, Roland Ritsch, and
          <string-name>
            <given-names>Norbert</given-names>
            <surname>Widmann</surname>
          </string-name>
          .
          <year>1998</year>
          .
          <article-title>The multidimensional database system RasDaMan</article-title>
          .
          <source>In Proceedings of the 1998 ACM SIGMOD international conference on Management of data. Association for Computing Machinery</source>
          , Washington, USA,
          <fpage>575</fpage>
          -
          <lpage>577</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Peter</given-names>
            <surname>Baumann</surname>
          </string-name>
          , Dimitar Misev, Vlad Merticariu, and Bang Pham Huu.
          <year>2021</year>
          .
          <article-title>Array databases: concepts, standards, implementations</article-title>
          .
          <source>Journal of Big Data</source>
          <volume>8</volume>
          ,
          <issue>1</issue>
          (
          <year>2021</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>61</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Phil</given-names>
            <surname>Bernstein</surname>
          </string-name>
          , Sergey Bykov, Alan Geller, Gabriel Kliot, and
          <string-name>
            <given-names>Jorgen</given-names>
            <surname>Thelin</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Orleans: Distributed virtual actors for programmability and scalability</article-title>
          .
          <source>MSR-TR-2014-41</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Philip A</given-names>
            <surname>Bernstein</surname>
          </string-name>
          , Mohammad Dashti, Tim Kiefer, and
          <string-name>
            <given-names>David</given-names>
            <surname>Maier</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Indexing in an Actor-Oriented Database</article-title>
          .
          <source>In CIDR</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Spyros</given-names>
            <surname>Blanas</surname>
          </string-name>
          , Kesheng Wu, Surendra Byna, Bin Dong, and
          <string-name>
            <given-names>Arie</given-names>
            <surname>Shoshani</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Parallel data analysis directly on scientific file formats</article-title>
          .
          <source>In Proceedings of the 2014 ACM SIGMOD international conference on Management of data. Association for Computing Machinery</source>
          , Utah, USA,
          <fpage>385</fpage>
          -
          <lpage>396</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Shaofeng</given-names>
            <surname>Cai</surname>
          </string-name>
          , Gang Chen, Beng Chin Ooi, and
          <string-name>
            <given-names>Jinyang</given-names>
            <surname>Gao</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Model slicing for supporting complex analytics with elastic inference cost and resource constraints</article-title>
          .
          <source>Proceedings of the VLDB Endowment 13</source>
          ,
          <issue>2</issue>
          (
          <year>2019</year>
          ),
          <fpage>86</fpage>
          -
          <lpage>99</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Min</given-names>
            <surname>Chen</surname>
          </string-name>
          , Shiwen Mao, Yin Zhang, Victor CM Leung, et al.
          <year>2014</year>
          .
          <source>Big data: related technologies, challenges and future prospects</source>
          . Vol.
          <volume>96</volume>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Gianpaolo</given-names>
            <surname>Cugola</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Margara</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Processing flows of information: From data stream to complex event processing</article-title>
          .
          <source>ACM Computing Surveys (CSUR) 44</source>
          ,
          <issue>3</issue>
          (
          <year>2012</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>62</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Anderson</given-names>
            <surname>Chaves da Silva</surname>
          </string-name>
          , Hermano Lourenço Souza Lustosa, Daniel Nascimento Ramos da Silva, Fábio André Machado Porto, and Patrick Valduriez
          .
          <year>2020</year>
          .
          <article-title>SAVIME: An Array DBMS for Simulation Analysis and ML Models Prediction</article-title>
          .
          <source>Journal of Information and Data Management</source>
          <volume>11</volume>
          ,
          <issue>3</issue>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Nikos</given-names>
            <surname>Giatrakos</surname>
          </string-name>
          , Elias Alevizos, Alexander Artikis, Antonios Deligiannakis, and
          <string-name>
            <given-names>Minos</given-names>
            <surname>Garofalakis</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Complex event recognition in the big data era: a survey</article-title>
          .
          <source>The VLDB Journal</source>
          <volume>29</volume>
          ,
          <issue>1</issue>
          (
          <year>2020</year>
          ),
          <fpage>313</fpage>
          -
          <lpage>352</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Ilya</given-names>
            <surname>Kolchinsky</surname>
          </string-name>
          and
          <string-name>
            <given-names>Assaf</given-names>
            <surname>Schuster</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Real-time multi-pattern detection over event streams</article-title>
          .
          In
          <source>Proceedings of the 2019 International Conference on Management of Data</source>
          .
          <fpage>589</fpage>
          -
          <lpage>606</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Chun-Cheng</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Der-Jiunn</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Chin-Hung</given-names>
            <surname>Kuo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Linnan</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Concept drift detection and adaption in big imbalance industrial IoT data using an ensemble learning method of offline classifiers</article-title>
          .
          <source>IEEE Access</source>
          <volume>7</volume>
          (
          <year>2019</year>
          ),
          <fpage>56198</fpage>
          -
          <lpage>56207</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Jie</given-names>
            <surname>Lu</surname>
          </string-name>
          , Anjin Liu, Fan Dong, Feng Gu, Joao Gama, and
          <string-name>
            <given-names>Guangquan</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Learning under concept drift: A review</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>31</volume>
          ,
          <issue>12</issue>
          (
          <year>2018</year>
          ),
          <fpage>2346</fpage>
          -
          <lpage>2363</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>David C.</given-names>
            <surname>Luckham</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems</article-title>
          . Addison-Wesley Longman Publishing Co., Inc., USA.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Meng</given-names>
            <surname>Ma</surname>
          </string-name>
          , Ping Wang, and
          <string-name>
            <given-names>Chao-Hsien</given-names>
            <surname>Chu</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Data management for internet of things: Challenges, approaches and opportunities</article-title>
          . In
          <source>2013 IEEE International conference on green computing and communications and IEEE Internet of Things and IEEE cyber, physical and social computing</source>
          . IEEE
          ,
          <fpage>1144</fpage>
          -
          <lpage>1151</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Mohsen</given-names>
            <surname>Marjani</surname>
          </string-name>
          , Fariza Nasaruddin, Abdullah Gani, Ahmad Karim, Ibrahim Abaker Targio Hashem, Aisha Siddiqa, and
          <string-name>
            <given-names>Ibrar</given-names>
            <surname>Yaqoob</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Big IoT data analytics: architecture, opportunities, and open research challenges</article-title>
          .
          <source>IEEE Access</source>
          <volume>5</volume>
          (
          <year>2017</year>
          ),
          <fpage>5247</fpage>
          -
          <lpage>5261</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Mehdi</given-names>
            <surname>Mohammadi</surname>
          </string-name>
          , Ala Al-Fuqaha,
          <string-name>
            <given-names>Sameh</given-names>
            <surname>Sorour</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Mohsen</given-names>
            <surname>Guizani</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Deep learning for IoT big data and streaming analytics: A survey</article-title>
          .
          <source>IEEE Communications Surveys &amp; Tutorials</source>
          <volume>20</volume>
          ,
          <issue>4</issue>
          (
          <year>2018</year>
          ),
          <fpage>2923</fpage>
          -
          <lpage>2960</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>John</given-names>
            <surname>Paparrizos</surname>
          </string-name>
          , Chunwei Liu, Bruno Barbarioli, Johnny Hwang, Ikraduya Edian, Aaron J Elmore, Michael J Franklin, and Sanjay Krishnan
          .
          <year>2021</year>
          .
          <article-title>VergeDB: A Database for IoT Analytics on Edge Devices</article-title>
          .
          In
          <source>CIDR</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>José</given-names>
            <surname>Roldán</surname>
          </string-name>
          , Juan Boubeta-Puig, José Luis Martínez, and
          <string-name>
            <given-names>Guadalupe</given-names>
            <surname>Ortiz</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Integrating complex event processing and machine learning: An intelligent architecture for detecting IoT security attacks</article-title>
          .
          <source>Expert Systems with Applications</source>
          <volume>149</volume>
          (
          <year>2020</year>
          ),
          <fpage>113251</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Vivek</given-names>
            <surname>Shah</surname>
          </string-name>
          and Marcos Antonio Vaz Salles.
          <year>2018</year>
          .
          <article-title>Reactors: A case for predictable, virtualized actor database systems</article-title>
          .
          In
          <source>Proceedings of the 2018 International Conference on Management of Data</source>
          .
          <fpage>259</fpage>
          -
          <lpage>274</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Stonebraker</surname>
          </string-name>
          , Paul Brown, Donghui Zhang, and
          <string-name>
            <given-names>Jacek</given-names>
            <surname>Becla</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>SciDB: A database management system for applications with complex analytics</article-title>
          .
          <source>Computing in Science &amp; Engineering</source>
          <volume>15</volume>
          ,
          <issue>3</issue>
          (
          <year>2013</year>
          ),
          <fpage>54</fpage>
          -
          <lpage>62</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Villarroya</surname>
          </string-name>
          and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Baumann</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>On the Integration of Machine Learning and Array Databases</article-title>
          .
          In
          <source>2020 IEEE 36th International Conference on Data Engineering (ICDE)</source>
          . IEEE,
          <fpage>1786</fpage>
          -
          <lpage>1789</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Yiwen</given-names>
            <surname>Wang</surname>
          </string-name>
          , Julio Cesar Dos Reis, Kasper Myrtue Borggren, Marcos Antonio Vaz Salles, Claudia Bauzer Medeiros, and
          <string-name>
            <given-names>Yongluan</given-names>
            <surname>Zhou</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Modeling and Building IoT Data Platforms with Actor-Oriented Databases</article-title>
          .
          In
          <source>EDBT</source>
          .
          <fpage>512</fpage>
          -
          <lpage>523</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Jennifer</given-names>
            <surname>Widom</surname>
          </string-name>
          and
          <string-name>
            <given-names>Stefano</given-names>
            <surname>Ceri</surname>
          </string-name>
          .
          <year>1996</year>
          .
          <article-title>Active database systems: Triggers and rules for advanced database processing</article-title>
          . Morgan Kaufmann.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Rongbin</given-names>
            <surname>Xu</surname>
          </string-name>
          , Yongliang Cheng, Zhiqiang Liu, Ying Xie, and
          <string-name>
            <given-names>Yun</given-names>
            <surname>Yang</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Improved Long Short-Term Memory based anomaly detection with concept drift adaptive method for supporting IoT services</article-title>
          .
          <source>Future Generation Computer Systems</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>