=Paper=
{{Paper
|id=Vol-1670/paper-27
|storemode=property
|title=From Cloud to Fog and Sunny Sensors
|pdfUrl=https://ceur-ws.org/Vol-1670/paper-27.pdf
|volume=Vol-1670
|authors=Hannes Grunert,Martin Kasparick,Björn Butzin,Andreas Heuer,Dirk Timmermann
|dblpUrl=https://dblp.org/rec/conf/lwa/GrunertKBHT16
}}
==From Cloud to Fog and Sunny Sensors==
<pdf width="1500px">https://ceur-ws.org/Vol-1670/paper-27.pdf</pdf>
<pre>
            From Cloud to Fog and Sunny Sensors
                                    Position Paper

    Hannes Grunert¹, Björn Butzin², Martin Kasparick², Andreas Heuer¹, and
                               Dirk Timmermann²

     ¹ University of Rostock, Database Research Group, 18051 Rostock, Germany
     ² University of Rostock, Institute of Applied Microelectronics and Computer
       Engineering, 18051 Rostock, Germany


          Abstract. Assistive systems collect large amounts of data in the inter-
          net of things and compute behavior and intentions of users in the cloud.
          Our approach is to push these computations (interpreted as database
          queries) as close as possible to the local sensors of the internet of things.
          We aim at replacing privacy-compromising cloud-based computations by
          fog- or edge-based computations or even by processing on the local
          sensors directly. Not only can this approach solve privacy problems,
          it also results in better performance and energy efficiency of the
          whole system: sensor-based computations are the (privacy-respecting)
          sunny side of the cloud. This position paper will give a short
          motivation and state
          of the art in different areas (from databases to wireless sensor networks)
          and will present our approach in combining modern interfaces to differ-
          ent sensors and concepts of database theory such as query rewriting and
          query containment.

          Keywords: Cloud, Fog and Edge Computing, Internet of Things, Sen-
          sor Networks, Privacy, Performance, Query Rewriting, Query Contain-
          ment


1        Motivation

Assistive systems support users at work (Ambient Assisted Working), while
they can also remotely control their homes (Ambient Assisted Living, AAL).
Through various sensors, information about the current situation and the
actions of the users is collected. This data is stored by the system and linked
with other
data from the web, for example the Facebook profiles of the users. By designing
models for intention and activity recognition from the connected data, the smart
environment can react autonomously to meet the needs of the users.
    In assistive systems [12], significantly more information than required is col-
lected in the cloud – which raises questions about privacy. The users usually have
no or only a very small influence on the storage and processing of their personal
data. If the cloud service is not located in their native country, the users cannot
be sure that the same laws apply as in their home countries. As a result, their
right to informational self-determination is violated.
    The introduction of data privacy mechanisms in assistive systems is viewed
very skeptically by developers. It is feared that the anonymization of the
data hinders system development. Anonymization or pseudonymization of the
data may lead to loss of detail, so that the results of analytic processes become
inaccurate and, in extreme cases, unusable.
    Our idea is to support assistive systems in performing the necessary behavior
and intention recognition algorithms, but to automatically push the analysis
operations as far as possible from the cloud to the (local) sensors. In an AAL
environment, this results in a sensor-based or fog-based (edge-based) instead of
a cloud-based computation. Besides privacy aspects, sensor-based or fog-based
computing can also increase the performance and energy-efficiency of the system.

1.1   Cyber-physical systems and wireless sensor networks
Moore’s law is often cited to argue that there is no need to increase
efficiency: one just has to wait for the next hardware generation and thus a
new boost of computational power. In wireless sensor networks (WSN),
cyber-physical systems (CPS) and the internet of things (IoT), the
considerations are different. Here, the main driving factor is the reduction
of energy consumption. Even newly developed mote platforms like UC Berkeley’s
Firestorm only have 512 kB of ROM and 64 kB of RAM [1], which in absolute
numbers is not significantly more than in devices of 2005, e.g., the TelosB
with 48 kB of ROM and 10 kB of RAM. Instead, the idle power consumption has
been reduced, in this example by about 55 percent from 5.1 µA to 2.3 µA.
Hence, the focus in developing such devices is different. To save as much
energy as possible, motes are kept asleep as long as possible. When a mote is
awake, transmitting data is the most costly operation in a WSN, so
transmissions are avoided wherever possible. Data should only be sent when
requested by others or when local memory is about to overflow. Additionally,
through aggregation, preprocessing and compression, the time until the next
data set has to be sent can be extended. At this point, computational power as
well as memory capacities have to be utilized in an efficient way. After
reducing the amount and frequency of required transmissions, the next step is
to keep the middleware and transmission protocol energy-efficient. This means
reducing protocol overhead, but stack sizes and dynamic memory consumption
also need to be taken into account.
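    The transmission policy described above (aggregate locally, send only on
request or when memory runs out) can be illustrated with a minimal Python
sketch. It is not taken from any mote firmware: the class MoteBuffer, its
parameters capacity and window, and the list sent (which merely stands in for
the radio link) are assumptions made for illustration.

```python
from statistics import mean


class MoteBuffer:
    """Hypothetical on-mote buffer that delays radio transmissions."""

    def __init__(self, capacity=8, window=4):
        self.capacity = capacity  # max number of aggregates kept in memory
        self.window = window      # raw readings folded into one aggregate
        self.raw = []
        self.aggregates = []
        self.sent = []            # stand-in for the radio link (assumption)

    def add_reading(self, value):
        """Store a reading; transmit only when local memory is full."""
        self.raw.append(value)
        if len(self.raw) == self.window:
            # Aggregation: several raw readings become a single value.
            self.aggregates.append(mean(self.raw))
            self.raw.clear()
        if len(self.aggregates) >= self.capacity:
            self.flush()

    def on_request(self):
        """Another node asked for data, so send what is buffered."""
        self.flush()

    def flush(self):
        if self.aggregates:
            self.sent.append(list(self.aggregates))
            self.aggregates.clear()


buf = MoteBuffer(capacity=2, window=4)
for value in [20, 22, 21, 21, 30, 30, 30, 30]:
    buf.add_reading(value)
```

In this sketch, eight raw readings lead to a single transmission containing
two aggregated values; a real mote would additionally have to account for
sleep cycles and protocol overhead.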
    The idea of privacy through locality and the energy-saving constraints of
wireless sensor networks harmonize with each other. However, other
requirements of databases might contradict the requirements of WSN, e.g., in
terms of latency, reliability, and consistency.

1.2   Privacy and Performance
By reducing and pre-aggregating raw sensor data to its minimal essence of re-
quired information, it is possible to protect privacy in smart systems. Addition-
ally, the overall performance of the system can be enhanced when less data is
analyzed on frequently busy nodes.
    As part of our research project, privacy concepts for processing queries in
assistive systems are designed. However, these concepts are not placed on top
of the existing analysis functions, but are integrated in close cooperation during
the development process.
    The data-avoidant passing of the information regarding sensors and context
to the analytical tools of the assistive system will not only improve the
privacy-friendliness of the system: by pre-compressing the data by means of
selection, aggregation, and compression operators on the sensor itself, it is
also possible to increase the efficiency of the system. The privacy rights and
the information requirements of the analysis tools can be implemented as
integrity constraints in the database system. Through the integrity
constraints, the necessary algorithms for anonymization and preprocessing can
be run directly on the database. Thus, a transfer of the local data to external
programs or modules that might be located on different computing units is
omitted.
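    As a rough illustration of selection and aggregation being evaluated
where the data is stored, the following Python sketch uses the standard
sqlite3 module as a stand-in for an embedded database on a local device. The
table readings, its columns and the sample values are hypothetical, and the
integrity constraints mentioned above are not modeled here.

```python
import sqlite3

# In-memory database standing in for a local, embedded data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ts INTEGER, room TEXT, temp REAL)")
rows = [
    (1, "kitchen", 20.0), (2, "kitchen", 22.0), (3, "kitchen", 24.0),
    (1, "bedroom", 18.0), (2, "bedroom", 19.0),
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)

# Selection (WHERE) and aggregation (GROUP BY) run next to the data,
# so only the summary row would have to leave the device.
summary = conn.execute(
    "SELECT room, COUNT(*), AVG(temp) FROM readings "
    "WHERE room = 'kitchen' GROUP BY room"
).fetchall()
```

Here five stored tuples are reduced to one summary tuple, which both lowers
the transmitted volume and hides the individual readings from upper layers.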
    Instead of using hundreds of thousands of computers in the cloud (e.g. Google
or Amazon), we can also use hundreds of billions of sensors or devices in the IoT
to perform the necessary computations for the behavior and intention recognition
of assistive systems. This results in fog or edge computing [11, 4] and even in
local data processing on sensors.


2   State of the Art

We now give a short overview of the research areas cloud, Big Data, IoT (espe-
cially middleware and embedded database systems) and fog computing.


Cloud and Big Data: In the era of Big Data, more and more information
is stored and processed in Cloud environments like IBM’s Bluemix and simi-
lar platforms. Such systems offer a variety of services and possibilities for data
storage, including services for the Internet of Things (e.g. APIs for REST and
MQTT).
    Unfortunately, privacy is often ignored or, at least, it is not guaranteed by
cloud services. For example, nearly every service offered by IBM Bluemix states
in the Terms of Use that the service “[...] does not comply with the US-EU [...]
Safe Harbor Frameworks” (e.g. the Watson service “Driver Behavior” [7]).


Internet of Things – Middleware: IoT, CPS and WSN are distributed networks
of small and heterogeneous applications. The service-oriented architecture
(SOA) approach has proven useful in such environments. SOA is well known
for its capability to integrate different applications horizontally and
vertically. Due to the restrictions of the devices in such scenarios, SOA
approaches have been tailored to fit the needs for reduced overhead, a smaller
memory footprint and less required computational power. Examples are the
Constrained Application Protocol (CoAP) and the Devices Profile for Web
Services (DPWS). CoAP is a promising candidate for IoT applications, as it is
a RESTful SOA type with less overhead than HTTP and adds publish-subscribe
mechanisms. A comparison of different CoAP implementations can be found in
[2]. A competing protocol for networked embedded devices is Message Queue
Telemetry Transport (MQTT). It is a pure publish-subscribe protocol that uses
a centralized broker to manage the message flow. Often the cloud is used as
broker. Thus, due to its centralized nature, MQTT is less suitable than CoAP
for our proposed solution.
    Data publishing of “traditional” database systems is performed via
established interfaces like JDBC or ODBC. Thanks to these standardized
interfaces, the programmer does not have to care about the actual
implementation and can write code independent of the actual database system.


Embedded Databases: Besides standard database systems, there exist several
specialized databases like Berkeley DB [10] and TinyDB [9]. These systems are
designed to run on resource-limited devices like Raspberry Pis or even as
embedded databases directly on the sensor. In [5], several approaches to
distributed database management on sensor networks are compared, TinyDB among
them, with a particular focus on energy efficiency. Acquisitional query
processing [5, 9] can push queries to sensors and select relevant sensors in a
WSN to reduce the number of sensors needed for a computation (sensor
reduction). In the existing approaches mentioned in [5], this sensor reduction
is done completely manually (by the programmer).


Fog Computing: What is missing in fog computing is a database-centered
approach to computation, that is, given a query representing a necessary
computation on the sensor data of the IoT, how to automatically avoid simply
transferring the complete data sets to the cloud servers.
    In [6] we introduced a framework for privacy-aware query processing in
layered networks of “traditional” databases. The query processor includes
modules for query rewriting and transformation, for detecting key-like
combinations of attributes, and for different anonymization concepts like
k-anonymity and slicing. It can be extended by including new concepts for
querying modern hardware, e.g. by rewriting a SQL query to different data
management layers, the lowest of these being only able to perform some simple
filter operations (like selections against given attribute values).


3   Vision

We propose a layered architecture with four logically distinguishable layers
(see Fig. 1). The Sensor Layer includes the sensors, which are very
resource-constrained in terms of CPU, memory, and power. The Personal Layer
consists of typically mobile devices or embedded systems with higher
performance but also strict power constraints, like smartphones or edge nodes
of a WSN. Routers, home automation control units, private servers, etc. build
up the Fog Layer. As these devices have a wired power supply, power saving is
not as relevant as for the two lower layers. The Cloud Layer is built by
powerful server farms without notable constraints regarding power, CPU, or
memory. Note that it is possible to have multiple layers within one of the four
major layers, e.g. there could be several Fog Layers within a multi-tenant or
office building. From the top to the bottom layer, resource constraints
increase and the number of possible (database-related) functionalities and
operations decreases. This layered approach has several advantages: In terms of
privacy, each layer defines a strict transition at which it can be specified
which data is passed upwards and at which granularity. This allows the
fine-grained protection of critical personal data like health data, as the
information can be stored and processed within the local parts of the system.
Generally, the lower the layer, the greater the ability of users to control
their own data.

                  Cloud Layer      complex analysis in R and SQL
                  Fog Layer        complex SQL queries with recursion
                  Personal Layer   simple queries with aggregation
                  Sensor Layer     simple filter operations

                         Fig. 1. Layered System Approach

    As lower layers are more resource constrained than the upper ones, the middle
layers provide functionalities for data processing as well as data transmission
and additionally proxy functionalities. This enables optimized query execution
according to the given resource constraints. The proxy functionality allows a
reduction of the amount of communication. Especially within the Sensor Layer
and between Sensor and Personal Layer this is a major aspect of power efficiency.
The layered approach enables optimizing the power efficiency of the overall
system and not only of local nodes.
    The constrained set of database operations at the lower layers can be com-
pensated efficiently by the vertical fragmentation of queries into pushed-down
and remainder queries. Thus, the privacy constraints of the users of these smart
environments such as assistive systems are supported.
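    The fragmentation into a pushed-down query and a remainder query can be
sketched as follows. This is only an illustration under simplifying
assumptions, not the rewriting procedure itself: the function split_query and
the sample readings are hypothetical, the pushed-down part is restricted to a
simple filter (as on the Sensor Layer), and the remainder is an aggregation
(as on the Personal Layer).

```python
def split_query(predicate, aggregate):
    """Split aggregate(filter(predicate, data)) into two fragments.

    Returns (pushed_down, remainder): the first fragment only filters,
    the second one aggregates the already filtered data.
    """
    def pushed_down(readings):
        # Sensor Layer: simple filter operation only.
        return [r for r in readings if predicate(r)]

    def remainder(filtered):
        # Personal Layer: simple query with aggregation.
        return aggregate(filtered)

    return pushed_down, remainder


def average(values):
    return sum(values) / len(values)


readings = [17.5, 23.0, 25.0, 19.0, 24.0]
pushed_down, remainder = split_query(lambda t: t > 22.0, average)

filtered = pushed_down(readings)   # evaluated on the sensor
result = remainder(filtered)       # evaluated one layer above
```

Only the three readings that satisfy the filter leave the sensor in this
sketch, and the combined result equals that of the unfragmented query.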
    We apply our concept to machine learning algorithms to show how they
will be transformed and pushed down in several steps resulting in simple filter
operations.
    A remaining open problem is to decide whether such queries can be performed
on a resource-constrained device. If not, we have to check if the data can be sent
to an upper layer without violating the privacy constraints of the users. This
open problem results in a query containment problem that will be part of our
future research.
     Currently, only simple algorithms can be split up into their basic functions.
The transformation of complex queries into simple fragments should be done
automatically. By rewriting the complex query Q into Q_j and Q_δ, where Q_δ is
executed outside of the protected system environment, we hope to only transfer
data to the cloud that do not compromise privacy. We use extensions of the
theory of query containment and query optimization for conjunctive queries
[3, 8] to consider more complex queries (including complex statistical functions
using aggregation and grouping) [6]. This approach can be extended to an IoT
scenario with multiple layers, where the top layer is a cloud system while the
bottom layer consists of embedded hardware.
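     For plain conjunctive queries, containment can be decided with the
classical homomorphism criterion: Q1 is contained in Q2 if and only if there
is a homomorphism from Q2 to Q1 that maps head to head. The following
brute-force Python sketch covers only this basic case, not the extensions to
aggregation and grouping mentioned above. The query encoding (variables are
strings starting with "?") and the relations Reading and Room are
assumptions made for illustration.

```python
from itertools import product


def is_var(term):
    """Variables are encoded as strings starting with '?' (assumption)."""
    return isinstance(term, str) and term.startswith("?")


def substitute(mapping, terms):
    """Apply a variable mapping; constants are mapped to themselves."""
    return tuple(mapping.get(t, t) for t in terms)


def contained_in(q1, q2):
    """Return True if q1 is contained in q2 (conjunctive queries only).

    A query is a pair (head, body): head is a tuple of terms, body is a
    list of atoms (relation_name, tuple_of_terms). The test searches
    exhaustively for a homomorphism from q2 to q1.
    """
    head1, body1 = q1
    head2, body2 = q2
    if len(head1) != len(head2):
        return False
    vars2 = sorted({t for _, args in body2 for t in args if is_var(t)}
                   | {t for t in head2 if is_var(t)})
    terms1 = sorted({t for _, args in body1 for t in args} | set(head1),
                    key=str)
    atoms1 = set(body1)
    for image in product(terms1, repeat=len(vars2)):
        mapping = dict(zip(vars2, image))
        if substitute(mapping, head2) != tuple(head1):
            continue
        if all((rel, substitute(mapping, args)) in atoms1
               for rel, args in body2):
            return True
    return False


# ans(?p) :- Reading(?p, ?r, ?t), Room(?r, 'kitchen')
q_specific = (("?p",), [("Reading", ("?p", "?r", "?t")),
                        ("Room", ("?r", "kitchen"))])
# ans(?p) :- Reading(?p, ?r, ?t)
q_general = (("?p",), [("Reading", ("?p", "?r", "?t"))])

specific_in_general = contained_in(q_specific, q_general)
general_in_specific = contained_in(q_general, q_specific)
```

The exhaustive search is exponential in the number of variables, so the
sketch is only meant for small queries; it shows that the more selective
query is contained in the general one, but not vice versa.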
     The handling of data in IoT environments will be rethought fundamentally.
Currently, data is just pushed to the cloud, while the layered approach enables
new methods to store, process, and query data on the lower layers. To achieve
this, IoT and database middleware have to be collated.

References
 1. Andersen, M.P., Fierro, G., Culler, D.E.: System design for a synergistic, low power
    mote/BLE embedded platform. In: 2016 15th ACM/IEEE International Confer-
    ence on Information Processing in Sensor Networks (IPSN). pp. 1–12
 2. Butzin, B., Konieczek, B., Fiehe, C., Golatowski, F.: Applying the BaaS reference
    architecture on different classes of devices. In: 2nd International Workshop on
    Modelling, Analysis, and Control of Complex CPS (CPS Data). pp. 1–6 (Apr
    2016), to be published
 3. Chirkova, R.: Query containment. In: Encyclopedia of Database Systems, pp. 2249–
    2253. Springer US (2009)
 4. Dastjerdi, A.V., Gupta, H., Calheiros, R.N., Ghosh, S.K., Buyya, R.: Fog comput-
    ing: Principles, architectures, and applications. CoRR abs/1601.02752 (2016)
 5. Diallo, O., Rodrigues, J.J.P.C., Sene, M., Mauri, J.L.: Distributed database man-
    agement techniques for wireless sensor networks. IEEE Trans. Parallel Distrib.
    Syst. 26(2), 604–620 (2015)
 6. Grunert, H., Heuer, A.: Datenschutz im PArADISE. Datenbank-Spektrum 16(2),
    107–107 (July 2016)
 7. IBM: IBM Watson IoT Driver Behavior Service.
    http://www-03.ibm.com/software/sla/sladb.nsf/sla/bm-7328-01?Open,
    last access: 09.06.2016
 8. Kolaitis, P.G., Vardi, M.Y.: Conjunctive-Query Containment and Constraint
    Satisfaction. In: 17th Symposium on Principles of Database Systems,
    Seattle. pp. 205–213 (1998)
 9. Madden, S.R., Franklin, M.J., Hellerstein, J.M., Hong, W.: TinyDB: an acquisi-
    tional query processing system for sensor networks. ACM Transactions on Database
    Systems (TODS) 30(1), 122–173 (2005)
10. Oracle: Oracle Berkeley DB 12c.
    http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html,
    last access: 09.06.2016
11. Shi, W., Dustdar, S.: The Promise of Edge Computing. Computer 49(5), 78–81
    (May 2016)
12. Weiser, M.: The Computer for the 21st Century. Scientific American 265, 94–104
    (1991)

</pre>