=Paper= {{Paper |id=Vol-2045/Bilisim_paper_4 |storemode=property |title= Relational & Non-Relational Database Management Systems Performance Comparison |pdfUrl=https://ceur-ws.org/Vol-2045/11_Bilisim_2017_paper_4.pdf |volume=Vol-2045 |authors=Gizem Kiraz,Cengiz Togay }} == Relational & Non-Relational Database Management Systems Performance Comparison== https://ceur-ws.org/Vol-2045/11_Bilisim_2017_paper_4.pdf
IoT Data Storage: Relational & Non-Relational Database
    Management Systems Performance Comparison
                  Gizem Kiraz                                                       Cengiz Toğay
              Computer Engineering                                              Computer Engineering
                Uludag University                                                 Uludag University
                 Gorukle, Bursa                                                     Gorukle, Bursa
           501631002@ogr.uludag.edu.tr                                          ctogay@ uludag.edu.tr


 ABSTRACT                                                       Big data is a term that describes the large volume of data
 Internet of Things (IoT) becomes recently a popular            –both structured and unstructured. The DBMSs basically
 research topic and market reality. According to several        can be separated into relational and non-relational DBMS.
 research companies, in 2025, up to 75 billion devices are      The relational DBMS stores the data rows and columns in
 estimated to connect internet and generate an enormous         tables with a high data consistency. The most commonly
 number of data. This increases in data cause several           used open source relational DBMS is MySQL. Non-
 difficulties such as the storage cost and processing of such   relational databases (NOSQL) have arisen as an
 large data. In this paper, we have been studied on             alternative to relational databases. The aim of the NOSQL
 performance comparison of relational (MySQL) and non-          is often not to give guaranty the Atomicity, Consistency,
 relational (MongoDB) database management systems for           Isolation, and Durability (ACID). The NOSQL does not
 storing and processing of this large IoT data. Both types      depend on constant table definitions and rigid schemas.
 of database management systems have been tested.               Columns or records can be added to the collection at any
 According to comparison of experimental results, the non-      time without exclusive process. Therefore; the number of
 relational database management systems, which we               records in the columns does not have to equal with each
 studied and searched, have provided better performance         other. Data sets in IoT environment can change after setup
 for storing and processing of large data.                      of the system, so this environment requires a flexible data
                                                                storage system. There are four different storage formats in
 Keywords                                                       NOSQL namely key-value, columns, document-based
 Internet of Things; MySQL; MongoDB; RDBMS;                     and graff-based.
 NRDBMS.
                                                                It has been investigated a document-based storage format
 INTRODUCTION                                                   in NOSQL. In such a system, a record is called document
 The Internet of Things (IoT) is a self-configuring and         and these documents are usually stored in JSON format
 adaptive system that consist of sensor networks and smart      [3]. There are various implementation of the NOSQL
 objects whose aim is to interconnect all devices/sensors in    DBMS such as MongoDB [4] [5], CouchDB [6], HBase
 daily life [1]. According to the projections of many           [7], Cassandra [8], Amazon SimpleDB [9], and Redis
 organizations and companies, up to 75 billion devices will     [10]. Since MongoDB is open source and commonly used,
 interconnect using the internet and several challenges and     it has been chosen MongoDB in this study. There are no
 issues that need to be addressed will raise. Therefore, IoT    database schemas or tables in MongoDB. MongoDB uses
 becomes one of the most popular research topic recent          “collection” instead of a table, and “document” instead of
 years. Moreover, IoT is closely related to big data and        rows to store data. Furthermore, MongoDB uses two
 cloud technology. Big data is produced by the different        different operations instead of the join operation. These
 types of the applications such as industrial processes,        are nesting documents inside each other and to store a
 medical devices, embedded control systems, gateways,           reference to the other document rather than nesting entire
 and GPS sensors etc. This means that an amount of data         document.
 worldwide increases day by day. In 2016, more than 5.5         There are many studies about comparing the performance
 million connected devices are inserted every day, and it is    of databases [11] [12] [13] [14]. These studies vary
 expected that number of devices more than 20.8 billion         depending on the data size, the variety of data, the
 worldwide by 2020 [2]. Sensors are also produced data          differences in databases used, implementation languages,
 and they are important for big data growth. The sensor         and subjects of the projects. In the study [15], MongoDB,
 data is the most popular data type between IoT                 MySQL, CouchDB, and Redis are compared. It is
 applications.                                                  declared that MongoDB is performing better among the
                                                                comparative database management systems in terms of
the "bulk insert" writing performance. However, MySQL
and MongoDB have similar performance results for
reading operations. Performance parameters between
these DBMSs can be negligible (typically less than 1
second) [15]. However, our test results show that
MongoDB has better performance than MySQL in terms
of reading and writing as represented in “Results of
Experiments” section. MongoDB is utilized for to store
GPS sensor data and to communicate with the analysis
tools such as Apache Mahout [11]. ACID operations on                           Figure 2.Test Environment
MongoDB and MySQL DBMSs are also applied to
compare them [12] [13]. According to the results, the use      writing transactions. In our platform, our target
of the MongoDB has been encouraged for large data              throughput is forty messages/milliseconds in average for
applications, especially for applications of big data [12].    writing. Our target throughput and test results compared
In [14], MongoDB, Raven, CouchDB, Cassandra,                   with “Results of Experiments” section.
HyperTable, CouchBase, and SQL DBMSs are compared              The rest of the article is organized as follows. “Test
in terms of the read, write, delete, and instantiate           Environment & Methodology” section consists of the
operations. According to results of this study[14], all keys   information about the environment of the experiments and
are needed to fetch, MongoDB has better performance            methodology. “Results of Experiments” section presents
than the others.                                               the results and graphics from the experiments.
The aim of the experiments is a comparison of the              “Conclusions” section summarizes and concludes the
relational and non-relational DBMSs for utilization an IoT     experiments and gives a recommendation for future of this
platform. The system includes IoT devices which publish        study.
a tremendous number of sensor data where servers store
and process them. Performance of reading and writing           TEST ENVIRONMENT & METHODOLOGY
tests has been done in both MYSQL and MongoDB in this          In this paper, research is conducted on the relational and
study. The test results which are calculated after the         non-relational databases. Server and clients of chosen
application runs at least three times are compared to find     DBMSs to be used in our experiments are set up
out where we can store data of IoT considering lowest cost     separately in virtual machines with Ubuntu 16.04 Server
in terms of throughput.                                        version. Virtual machines are hosted on a physical
                                                               machine (i7 6700HQ, 16 GB DDR3 RAM, and SSD disc).
IoT platform is defined which collects, and processes          The virtual machines (4 CPU cores, 4 GB RAM) are
sensor data as seen in Figure 1. The sensor data is            executed on the physical servers as depicted in Figure 2.
produced by devices/clients and collected by Data              A dedicated network is a setup among the servers.
Storage Server (DSS) in server side. The data is stored in     Therefore, it is guaranteed that another network traffıc is
a DBMS through insert and update operations. Data              not disrupted the tests. Both DBMSs are installed on the
Query Server (DQS) in the platform provides an interface       computer with SSD for the fastest possible read and write
for processing and reporting by Client Application. The        speed and they are executed separately during the test
client application sends the data request to the DQS and       scenarios. Multithreaded Java applications for reading and
they are formed as a query for DBMS. The result of the         writing operations on DBMSs are implemented.
query is delivered to the client application. In our case,
enormous data should be stored in the DBMS. Therefore,         Experiments’ constraints are the number of machines,
writing operations are more important than the reading         number of threads, number of messages, and the size of
operations. The platform has active-active architecture.       the string. These constraints are applied for both DBMSs.
Therefore, more than one DSSs can handle the data              The application is executed on a virtual machine;
                                                               therefore, the applications are limited in terms of the CPU
                                                               core and memory. In this study, since writing operations
                                                               are more important than reading operations, our tests
                                                               concentrate on the writing operations. In our tests, data
                                                               examples are selected as similar to real-time sensor
                                                               applications.
                                                               In this study, two columns are defined including variable-
                                                               length string type, an integer type. A primary key column
                                                               is automatically defined in MySQL, but it needs to be
                                                               defined in MongoDB. The execution time of the tests is
                                                               calculated in milliseconds. The number of messages is
    Figure 1.The IoT Platform Architectural Structure          measured dividing the total number of messages into the
                                                               experiment’s execution time.
                                                        MYSQL                             MySQL; forty threads for one computer and twenty
                            20,00
                                                                                          threads for two computers and also for MongoDB; eighty
  NUMBER OF MESSAGES/TIME   15,00
                                                                                          threads for one computer, forty threads for two computers,
                                                                                          and twenty threads for three computers. It can be seen that
                            10,00                                                         MongoDB supports more than sixty threads for best
                                                                                          throughput.
                             5,00
                                                                                                                                                                   MYSQL




                                                                                          NUMBER OF MESSAGES / TIME
                                                                                                                              20,00
                             0,00
                                    500000 1000000 1500000 2000000 4000000
                                                    NUMBER OF MESSAGES
                                      1-10             1-20          1-40       1-80                                          10,00

                                      2-10             2-20          2-40       2-80



                                                                                                                                      0,00
               Figure 3.Result Graph of Write Operations on MySQL                                                                             500000       1000000 1500000 2000000 4000000
                                                                                                                                                               NUMBER OF MESSAGES
                                                                                                                                                2-10-100               2-20-100                2-40-100
RESULTS OF EXPERIMENTS                                                                                                                          2-10-1000              2-20-1000               2-40-1000
Insert tests are executed with a multithreaded Java                                                                                             2-10-2000              2-20-2000               2-40-2000

application. The application sends insert SQL request
which contains a string (100 characters) and an integer                                   Figure 5.Results of String Length Experiments on MySQL
value to DBMSs. The application is executed in
                                                                                          It is also tested the data with different variable string
computers based on the parameters (number of computer
                                                                                          length as depicted in Figure 5 and Figure 6. Since best
and threads) as depicted in Figure 3 and Figure 4. Best
                                                                                          results are obtained from the two computers, only two
throughput is 18.21 messages/millisecond in average for
                                                                                          computers cases are tested in these tests. As it can be seen
MySQL such that is succeeded with two computers each
                                                                                          that 18.21, 9.33, and 6.13 messages/millisecond in
has twenty threads as presented in Figure 3. As it can be
                                                                                          average are obtained for MySQL with 100, 1000, and
seen that forty threads for one and two computers
                                                                                          2000 string length respectively as depicted in Figure 5.
utilization have close results. Best throughput will go up
                                                                                          Similarly, previous results, twenty threads utilization for
when the number of threads increases. However, when the
                                                                                          MySQL has best throughput results. For MongoDB, forty
number of threads reaches eighty, throughput value begins
                                                                                          threads utilization has best throughput results; 70.95,
to decrease. Therefore, it has been decided that MySQL
                                                                                          49.04, and 39.85 messages/millisecond in average are
can manage forty threads for best results.
                                                                                          obtained with 100, 1000, and string (2000 characters)
Similarly, best throughput is 70.95 messages/millisecond                                  respectively as depicted in Figure 6. MongoDB DBMS
in average for MongoDB such that is succeeded with two                                    has about four times better results than MYSQL. As it can
computers each has forty threads as presented in Figure 4.                                be seen that length of the message is one of the most
To eliminate the effects of the thread switching, the third                               important parameters. Such as when the length is doubled,
computer is also used for MongoDB. Throughput for                                         throughput is decreased about twenty percent.
utilization of the one, two and three computers is 61.22,
                                                                                          Select tests are also executed with the same multithreaded
70.95, and 61.65 messages/millisecond, respectively. As
                                                                                          Java application. The application retrieves data from two
it can be seen that two computers’ utilization has the best
                                                                                          DBMSs. The application is executed on a computer and
performance. Another result can be obtained from the
                                                                                          two computers with a different number of the threads as
Figure. 3 and Figure. 4 is a correlation between a number
                                                                                          depicted in Figure 7 and Figure 8. MySQL results are not
of threads and computers. Best results are obtained for

                            80,00                     MONGODB                                                                         80,00                     MONGODB
                                                                                                          NUMBER OF MESSAGES / TIME
 NUMBER OF MESSAGES/TIME




                            60,00
                                                                                                                                      60,00

                            40,00
                                                                                                                                      40,00

                            20,00
                                                                                                                                      20,00
                             0,00
                                    500.000 1.000.0001.500.0002.000.0004.000.000                                                       0,00
                                                 NUMBER OF MESSAGES                                                                            500000       1000000 1500000 2000000 4000000
                                             1-10             1-20       1-40      1-80                                                                         NUMBER OF MESSAGES
                                                                                                                                               2-10-100         2-20-100           2-40-100           2-80-100
                                             2-10             2-20       2-40      2-80
                                             3-10             3-20       3-40      3-80                                                        2-10-1000        2-20-1000          2-40-1000          2-80-1000
                                                                                                                                               2-10-2000        2-20-2000          2-40-2000          2-80-2000
                                                                                                                                       Figure 6.Results of String Length Experiments on
 Figure 4.Result Graph of Write Operations on MongoDB
                                                                                                                                                           MongoDB
                                                                                                                           The DQS in the IoT platform applies reading operations
                                  80,00                              MYSQL                                                 on the DBMS. Test results show that MongoDB has better
 NUMBER OF MESSAGES/TIME
                                                                                                                           throughput than the MySQL. MongoDB has 55.07
                                  60,00
                                                                                                                           messages/millisecond in average and MySQL has 46.66
                                  40,00
                                                                                                                           messages/millisecond in average for two computers.
                                                                                                                           Therefore, MongoDB is selected as DBMS for writing
                                  20,00                                                                                    and reading operations in the IoT Platform.

                                   0,00                                                                                    ACKNOWLEDGEMENT
                                            1 0 0 . 0 0 0 2 5 0 . 0 0 0 5 0 0 . 0 0 0 1 . 0 0 0 . 0 0 02 . 0 0 0 . 0 0 0   These results preliminary study of the project proposal
                                                             NUMBER OF MESSAGES
                                          1-10             1-15              1-20              1-40              1-80      applied to TUBITAK 1505 University Industry
                                          2-10             2-15              2-20              2-40                        Collaboration Grant Program and the study is supported
                                                                                                                           by EMKO Electronic A.Ş located in Bursa, Turkey.
               Figure 7.Result Graph of Read Operations on MySQL
                                                                                                                           REFERENCES
stable as MongoDB. Best throughput is 60.09                                                                                [1]    “IoT.”                  [Online].                 Available:
messages/millisecond and 44.34 in average for MySQL                                                                               https://connectedtechnbiz.wordpress.com/tag/internet-of-
                                                                                                                                  things/.
with one computer has eighty threads. In MongoDB, the
best result is 68.68 messages/millisecond and 58.61                                                                        [2]    “Gartner Says 6.4 Billion Connected ‘Things’ Will Be in Use
messages/millisecond in average is succeeded with three                                                                           in 2016, Up 30 Percent From 2015,” 2015. [Online].
                                                                                                                                  Available: http://www.gartner.com/newsroom/id/3165317.
computers each has eighty threads as depicted in Figure 8.
Furthermore, it has been concluded that MongoDB can                                                                        [3]    “JSON.” [Online]. Available: http://json.org/.
manage eighty threads for reading operations.                                                                              [4]    “MongoDB                                       Documentation,”
                                                                                                                                  https://docs.mongodb.com/manual/. .
CONCLUSIONS                                                                                                                [5]    K. Chodorow, Mongo DB: The Definitive Guide. 2013.
Our       IoT     platform      requires    that    forty
messages/milliseconds in average should be written to the                                                                  [6]    “CouchDB.” [Online]. Available: http://couchdb.apache.org/.
chosen DBMS. Otherwise, the number of messages                                                                             [7]    “HBASE.” [Online]. Available: https://hbase.apache.org/.
waiting in the queue for writing will increase and it can                                                                  [8]    “Cassandra.”                [Online].               Available:
cause memory problems. In the IoT platform, active-                                                                               http://cassandra.apache.org/.
active architecture is applied. Therefore, more than one                                                                   [9]    “Amazon         SimpleDB.”        [Online].         Available:
computer can write to DBMS at the same time. Test                                                                                 https://aws.amazon.com/simpledb/.
results show that MongoDB has better performance than
                                                                                                                           [10]   “Redis.” [Online]. Available: https://redis.io/.
the MYSQL in terms of both writing and reading
operations. In the IoT Platform, the message payload is                                                                    [11]   G. Aydin, I. R. Hallac, and B. Karakus, “Architecture and
varying between 100 bytes and 200 bytes. For these types                                                                          implementation of a scalable sensor data storage and analysis
                                                                                                                                  system using cloud computing and big data technologies,” J.
of messages, MongoDB has 70.95 messages/milliseconds                                                                              Sensors, vol. 2015, 2015.
in average and MySQL has 18.21 messages/milliseconds
                                                                                                                           [12]   S. Chickerur, A. Goudar, and A. Kinnerkar, “Comparison of
in average for writing. As it can be seen that only                                                                               Relational Database with Document-Oriented Database
MongoDB satisfy the target expectations 40                                                                                        (MongoDB) for Big Data Applications,” Proc. - 8th Int. Conf.
messages/milliseconds in average. MongoDB also can                                                                                Adv. Softw. Eng. Its Appl. ASEA 2015, pp. 41–47, 2016.
satisfy the requirement with a single machine which has                                                                    [13]   Z. Parker, S. Poe, and S. V. Vrbsky, “Comparing NoSQL
61.22 messages/milliseconds throughput for writing.                                                                               MongoDB to an SQL DB,” Proc. 51st ACM Southeast Conf. -
                                                                                                                                  ACMSE ’13, p. 1, 2013.
                                80,00                             MONGODB                                                  [14]   Y. Li and S. Manoharan, “A performance comparison of SQL
    NUMBER OF MESSAGES / TIME




                                                                                                                                  and NoSQL databases,” IEEE Pacific RIM Conf. Commun.
                                                                                                                                  Comput. Signal Process. - Proc., no. August 2013, pp. 15–19,
                                60,00                                                                                             2013.
                                                                                                                           [15]   P. T. A. Mai, J. K. Nurminen, and M. Di Francesco, “Cloud
                                40,00                                                                                             databases for internet-of-things data,” Proc. - 2014 IEEE Int.
                                                                                                                                  Conf. Internet Things, iThings 2014, 2014 IEEE Int. Conf.
                                                                                                                                  Green Comput. Commun. GreenCom 2014 2014 IEEE Int.
                                20,00                                                                                             Conf. Cyber-Physical-Social Comput. CPS 20, no. iThings,
                                          100.000         250.000         500.000       1.000.000 2.000.000                       pp. 117–124, 2014.
                                                              NUMBER OF MESSAGES
                                            1-10             1-15             1-20             1-40              1-80
                                            2-10             2-15             2-20             2-40              2-80
                                            3-10             3-15             3-20             3-40              3-80



 Figure 8.Result Graph of Read Operations on MongoDB
BIOGRAPHY(S)
Gizem Kiraz                                         Cengiz Toğay
               Gizem Kiraz has completed her                           Cengiz Togay, Ph.D. is an assistant
               undergraduate (2016) in Computer                        professor at Uludag University's
               Engineering Department from the                         Computer Engineering Department.
               Pamukkale University (PAÜ). Her                          He obtained his undergraduate (1999)
               postgraduate has started (2017) in                       and MS (2001) in computer
               Computer Engineering Department                          engineering from the Canakkale
               from the Uludag University.                              Onsekiz Mart University and his PhD
               Currently, she is studying on the                        (2008) in computer engineering
               Internet of Things.                                      department from Middle East
                                                    Technical University. He has national and international
                                                    patent applications, papers, articles and projects about
                                                    Software Engineering, Secure Communications, Smart
                                                    Cards, and Internet of Things(IoT).