=Paper=
{{Paper
|id=Vol-2045/Bilisim_paper_4
|storemode=property
|title= Relational & Non-Relational Database Management Systems Performance Comparison
|pdfUrl=https://ceur-ws.org/Vol-2045/11_Bilisim_2017_paper_4.pdf
|volume=Vol-2045
|authors=Gizem Kiraz,Cengiz Togay
}}
== Relational & Non-Relational Database Management Systems Performance Comparison==
IoT Data Storage: Relational & Non-Relational Database
Management Systems Performance Comparison
Gizem Kiraz Cengiz Toğay
Computer Engineering Computer Engineering
Uludag University Uludag University
Gorukle, Bursa Gorukle, Bursa
501631002@ogr.uludag.edu.tr ctogay@ uludag.edu.tr
ABSTRACT Big data is a term that describes the large volume of data
Internet of Things (IoT) becomes recently a popular –both structured and unstructured. The DBMSs basically
research topic and market reality. According to several can be separated into relational and non-relational DBMS.
research companies, in 2025, up to 75 billion devices are The relational DBMS stores the data rows and columns in
estimated to connect internet and generate an enormous tables with a high data consistency. The most commonly
number of data. This increases in data cause several used open source relational DBMS is MySQL. Non-
difficulties such as the storage cost and processing of such relational databases (NOSQL) have arisen as an
large data. In this paper, we have been studied on alternative to relational databases. The aim of the NOSQL
performance comparison of relational (MySQL) and non- is often not to give guaranty the Atomicity, Consistency,
relational (MongoDB) database management systems for Isolation, and Durability (ACID). The NOSQL does not
storing and processing of this large IoT data. Both types depend on constant table definitions and rigid schemas.
of database management systems have been tested. Columns or records can be added to the collection at any
According to comparison of experimental results, the non- time without exclusive process. Therefore; the number of
relational database management systems, which we records in the columns does not have to equal with each
studied and searched, have provided better performance other. Data sets in IoT environment can change after setup
for storing and processing of large data. of the system, so this environment requires a flexible data
storage system. There are four different storage formats in
Keywords NOSQL namely key-value, columns, document-based
Internet of Things; MySQL; MongoDB; RDBMS; and graff-based.
NRDBMS.
It has been investigated a document-based storage format
INTRODUCTION in NOSQL. In such a system, a record is called document
The Internet of Things (IoT) is a self-configuring and and these documents are usually stored in JSON format
adaptive system that consist of sensor networks and smart [3]. There are various implementation of the NOSQL
objects whose aim is to interconnect all devices/sensors in DBMS such as MongoDB [4] [5], CouchDB [6], HBase
daily life [1]. According to the projections of many [7], Cassandra [8], Amazon SimpleDB [9], and Redis
organizations and companies, up to 75 billion devices will [10]. Since MongoDB is open source and commonly used,
interconnect using the internet and several challenges and it has been chosen MongoDB in this study. There are no
issues that need to be addressed will raise. Therefore, IoT database schemas or tables in MongoDB. MongoDB uses
becomes one of the most popular research topic recent “collection” instead of a table, and “document” instead of
years. Moreover, IoT is closely related to big data and rows to store data. Furthermore, MongoDB uses two
cloud technology. Big data is produced by the different different operations instead of the join operation. These
types of the applications such as industrial processes, are nesting documents inside each other and to store a
medical devices, embedded control systems, gateways, reference to the other document rather than nesting entire
and GPS sensors etc. This means that an amount of data document.
worldwide increases day by day. In 2016, more than 5.5 There are many studies about comparing the performance
million connected devices are inserted every day, and it is of databases [11] [12] [13] [14]. These studies vary
expected that number of devices more than 20.8 billion depending on the data size, the variety of data, the
worldwide by 2020 [2]. Sensors are also produced data differences in databases used, implementation languages,
and they are important for big data growth. The sensor and subjects of the projects. In the study [15], MongoDB,
data is the most popular data type between IoT MySQL, CouchDB, and Redis are compared. It is
applications. declared that MongoDB is performing better among the
comparative database management systems in terms of
the "bulk insert" writing performance. However, MySQL
and MongoDB have similar performance results for
reading operations. Performance parameters between
these DBMSs can be negligible (typically less than 1
second) [15]. However, our test results show that
MongoDB has better performance than MySQL in terms
of reading and writing as represented in “Results of
Experiments” section. MongoDB is utilized for to store
GPS sensor data and to communicate with the analysis
tools such as Apache Mahout [11]. ACID operations on Figure 2.Test Environment
MongoDB and MySQL DBMSs are also applied to
compare them [12] [13]. According to the results, the use writing transactions. In our platform, our target
of the MongoDB has been encouraged for large data throughput is forty messages/milliseconds in average for
applications, especially for applications of big data [12]. writing. Our target throughput and test results compared
In [14], MongoDB, Raven, CouchDB, Cassandra, with “Results of Experiments” section.
HyperTable, CouchBase, and SQL DBMSs are compared The rest of the article is organized as follows. “Test
in terms of the read, write, delete, and instantiate Environment & Methodology” section consists of the
operations. According to results of this study[14], all keys information about the environment of the experiments and
are needed to fetch, MongoDB has better performance methodology. “Results of Experiments” section presents
than the others. the results and graphics from the experiments.
The aim of the experiments is a comparison of the “Conclusions” section summarizes and concludes the
relational and non-relational DBMSs for utilization an IoT experiments and gives a recommendation for future of this
platform. The system includes IoT devices which publish study.
a tremendous number of sensor data where servers store
and process them. Performance of reading and writing TEST ENVIRONMENT & METHODOLOGY
tests has been done in both MYSQL and MongoDB in this In this paper, research is conducted on the relational and
study. The test results which are calculated after the non-relational databases. Server and clients of chosen
application runs at least three times are compared to find DBMSs to be used in our experiments are set up
out where we can store data of IoT considering lowest cost separately in virtual machines with Ubuntu 16.04 Server
in terms of throughput. version. Virtual machines are hosted on a physical
machine (i7 6700HQ, 16 GB DDR3 RAM, and SSD disc).
IoT platform is defined which collects, and processes The virtual machines (4 CPU cores, 4 GB RAM) are
sensor data as seen in Figure 1. The sensor data is executed on the physical servers as depicted in Figure 2.
produced by devices/clients and collected by Data A dedicated network is a setup among the servers.
Storage Server (DSS) in server side. The data is stored in Therefore, it is guaranteed that another network traffıc is
a DBMS through insert and update operations. Data not disrupted the tests. Both DBMSs are installed on the
Query Server (DQS) in the platform provides an interface computer with SSD for the fastest possible read and write
for processing and reporting by Client Application. The speed and they are executed separately during the test
client application sends the data request to the DQS and scenarios. Multithreaded Java applications for reading and
they are formed as a query for DBMS. The result of the writing operations on DBMSs are implemented.
query is delivered to the client application. In our case,
enormous data should be stored in the DBMS. Therefore, Experiments’ constraints are the number of machines,
writing operations are more important than the reading number of threads, number of messages, and the size of
operations. The platform has active-active architecture. the string. These constraints are applied for both DBMSs.
Therefore, more than one DSSs can handle the data The application is executed on a virtual machine;
therefore, the applications are limited in terms of the CPU
core and memory. In this study, since writing operations
are more important than reading operations, our tests
concentrate on the writing operations. In our tests, data
examples are selected as similar to real-time sensor
applications.
In this study, two columns are defined including variable-
length string type, an integer type. A primary key column
is automatically defined in MySQL, but it needs to be
defined in MongoDB. The execution time of the tests is
calculated in milliseconds. The number of messages is
Figure 1.The IoT Platform Architectural Structure measured dividing the total number of messages into the
experiment’s execution time.
MYSQL MySQL; forty threads for one computer and twenty
20,00
threads for two computers and also for MongoDB; eighty
NUMBER OF MESSAGES/TIME 15,00
threads for one computer, forty threads for two computers,
and twenty threads for three computers. It can be seen that
10,00 MongoDB supports more than sixty threads for best
throughput.
5,00
MYSQL
NUMBER OF MESSAGES / TIME
20,00
0,00
500000 1000000 1500000 2000000 4000000
NUMBER OF MESSAGES
1-10 1-20 1-40 1-80 10,00
2-10 2-20 2-40 2-80
0,00
Figure 3.Result Graph of Write Operations on MySQL 500000 1000000 1500000 2000000 4000000
NUMBER OF MESSAGES
2-10-100 2-20-100 2-40-100
RESULTS OF EXPERIMENTS 2-10-1000 2-20-1000 2-40-1000
Insert tests are executed with a multithreaded Java 2-10-2000 2-20-2000 2-40-2000
application. The application sends insert SQL request
which contains a string (100 characters) and an integer Figure 5.Results of String Length Experiments on MySQL
value to DBMSs. The application is executed in
It is also tested the data with different variable string
computers based on the parameters (number of computer
length as depicted in Figure 5 and Figure 6. Since best
and threads) as depicted in Figure 3 and Figure 4. Best
results are obtained from the two computers, only two
throughput is 18.21 messages/millisecond in average for
computers cases are tested in these tests. As it can be seen
MySQL such that is succeeded with two computers each
that 18.21, 9.33, and 6.13 messages/millisecond in
has twenty threads as presented in Figure 3. As it can be
average are obtained for MySQL with 100, 1000, and
seen that forty threads for one and two computers
2000 string length respectively as depicted in Figure 5.
utilization have close results. Best throughput will go up
Similarly, previous results, twenty threads utilization for
when the number of threads increases. However, when the
MySQL has best throughput results. For MongoDB, forty
number of threads reaches eighty, throughput value begins
threads utilization has best throughput results; 70.95,
to decrease. Therefore, it has been decided that MySQL
49.04, and 39.85 messages/millisecond in average are
can manage forty threads for best results.
obtained with 100, 1000, and string (2000 characters)
Similarly, best throughput is 70.95 messages/millisecond respectively as depicted in Figure 6. MongoDB DBMS
in average for MongoDB such that is succeeded with two has about four times better results than MYSQL. As it can
computers each has forty threads as presented in Figure 4. be seen that length of the message is one of the most
To eliminate the effects of the thread switching, the third important parameters. Such as when the length is doubled,
computer is also used for MongoDB. Throughput for throughput is decreased about twenty percent.
utilization of the one, two and three computers is 61.22,
Select tests are also executed with the same multithreaded
70.95, and 61.65 messages/millisecond, respectively. As
Java application. The application retrieves data from two
it can be seen that two computers’ utilization has the best
DBMSs. The application is executed on a computer and
performance. Another result can be obtained from the
two computers with a different number of the threads as
Figure. 3 and Figure. 4 is a correlation between a number
depicted in Figure 7 and Figure 8. MySQL results are not
of threads and computers. Best results are obtained for
80,00 MONGODB 80,00 MONGODB
NUMBER OF MESSAGES / TIME
NUMBER OF MESSAGES/TIME
60,00
60,00
40,00
40,00
20,00
20,00
0,00
500.000 1.000.0001.500.0002.000.0004.000.000 0,00
NUMBER OF MESSAGES 500000 1000000 1500000 2000000 4000000
1-10 1-20 1-40 1-80 NUMBER OF MESSAGES
2-10-100 2-20-100 2-40-100 2-80-100
2-10 2-20 2-40 2-80
3-10 3-20 3-40 3-80 2-10-1000 2-20-1000 2-40-1000 2-80-1000
2-10-2000 2-20-2000 2-40-2000 2-80-2000
Figure 6.Results of String Length Experiments on
Figure 4.Result Graph of Write Operations on MongoDB
MongoDB
The DQS in the IoT platform applies reading operations
80,00 MYSQL on the DBMS. Test results show that MongoDB has better
NUMBER OF MESSAGES/TIME
throughput than the MySQL. MongoDB has 55.07
60,00
messages/millisecond in average and MySQL has 46.66
40,00
messages/millisecond in average for two computers.
Therefore, MongoDB is selected as DBMS for writing
20,00 and reading operations in the IoT Platform.
0,00 ACKNOWLEDGEMENT
1 0 0 . 0 0 0 2 5 0 . 0 0 0 5 0 0 . 0 0 0 1 . 0 0 0 . 0 0 02 . 0 0 0 . 0 0 0 These results preliminary study of the project proposal
NUMBER OF MESSAGES
1-10 1-15 1-20 1-40 1-80 applied to TUBITAK 1505 University Industry
2-10 2-15 2-20 2-40 Collaboration Grant Program and the study is supported
by EMKO Electronic A.Ş located in Bursa, Turkey.
Figure 7.Result Graph of Read Operations on MySQL
REFERENCES
stable as MongoDB. Best throughput is 60.09 [1] “IoT.” [Online]. Available:
messages/millisecond and 44.34 in average for MySQL https://connectedtechnbiz.wordpress.com/tag/internet-of-
things/.
with one computer has eighty threads. In MongoDB, the
best result is 68.68 messages/millisecond and 58.61 [2] “Gartner Says 6.4 Billion Connected ‘Things’ Will Be in Use
messages/millisecond in average is succeeded with three in 2016, Up 30 Percent From 2015,” 2015. [Online].
Available: http://www.gartner.com/newsroom/id/3165317.
computers each has eighty threads as depicted in Figure 8.
Furthermore, it has been concluded that MongoDB can [3] “JSON.” [Online]. Available: http://json.org/.
manage eighty threads for reading operations. [4] “MongoDB Documentation,”
https://docs.mongodb.com/manual/. .
CONCLUSIONS [5] K. Chodorow, Mongo DB: The Definitive Guide. 2013.
Our IoT platform requires that forty
messages/milliseconds in average should be written to the [6] “CouchDB.” [Online]. Available: http://couchdb.apache.org/.
chosen DBMS. Otherwise, the number of messages [7] “HBASE.” [Online]. Available: https://hbase.apache.org/.
waiting in the queue for writing will increase and it can [8] “Cassandra.” [Online]. Available:
cause memory problems. In the IoT platform, active- http://cassandra.apache.org/.
active architecture is applied. Therefore, more than one [9] “Amazon SimpleDB.” [Online]. Available:
computer can write to DBMS at the same time. Test https://aws.amazon.com/simpledb/.
results show that MongoDB has better performance than
[10] “Redis.” [Online]. Available: https://redis.io/.
the MYSQL in terms of both writing and reading
operations. In the IoT Platform, the message payload is [11] G. Aydin, I. R. Hallac, and B. Karakus, “Architecture and
varying between 100 bytes and 200 bytes. For these types implementation of a scalable sensor data storage and analysis
system using cloud computing and big data technologies,” J.
of messages, MongoDB has 70.95 messages/milliseconds Sensors, vol. 2015, 2015.
in average and MySQL has 18.21 messages/milliseconds
[12] S. Chickerur, A. Goudar, and A. Kinnerkar, “Comparison of
in average for writing. As it can be seen that only Relational Database with Document-Oriented Database
MongoDB satisfy the target expectations 40 (MongoDB) for Big Data Applications,” Proc. - 8th Int. Conf.
messages/milliseconds in average. MongoDB also can Adv. Softw. Eng. Its Appl. ASEA 2015, pp. 41–47, 2016.
satisfy the requirement with a single machine which has [13] Z. Parker, S. Poe, and S. V. Vrbsky, “Comparing NoSQL
61.22 messages/milliseconds throughput for writing. MongoDB to an SQL DB,” Proc. 51st ACM Southeast Conf. -
ACMSE ’13, p. 1, 2013.
80,00 MONGODB [14] Y. Li and S. Manoharan, “A performance comparison of SQL
NUMBER OF MESSAGES / TIME
and NoSQL databases,” IEEE Pacific RIM Conf. Commun.
Comput. Signal Process. - Proc., no. August 2013, pp. 15–19,
60,00 2013.
[15] P. T. A. Mai, J. K. Nurminen, and M. Di Francesco, “Cloud
40,00 databases for internet-of-things data,” Proc. - 2014 IEEE Int.
Conf. Internet Things, iThings 2014, 2014 IEEE Int. Conf.
Green Comput. Commun. GreenCom 2014 2014 IEEE Int.
20,00 Conf. Cyber-Physical-Social Comput. CPS 20, no. iThings,
100.000 250.000 500.000 1.000.000 2.000.000 pp. 117–124, 2014.
NUMBER OF MESSAGES
1-10 1-15 1-20 1-40 1-80
2-10 2-15 2-20 2-40 2-80
3-10 3-15 3-20 3-40 3-80
Figure 8.Result Graph of Read Operations on MongoDB
BIOGRAPHY(S)
Gizem Kiraz Cengiz Toğay
Gizem Kiraz has completed her Cengiz Togay, Ph.D. is an assistant
undergraduate (2016) in Computer professor at Uludag University's
Engineering Department from the Computer Engineering Department.
Pamukkale University (PAÜ). Her He obtained his undergraduate (1999)
postgraduate has started (2017) in and MS (2001) in computer
Computer Engineering Department engineering from the Canakkale
from the Uludag University. Onsekiz Mart University and his PhD
Currently, she is studying on the (2008) in computer engineering
Internet of Things. department from Middle East
Technical University. He has national and international
patent applications, papers, articles and projects about
Software Engineering, Secure Communications, Smart
Cards, and Internet of Things(IoT).