<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">IoT Data Storage: Relational &amp; Non-Relational Database Management Systems Performance Comparison</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Gizem</forename><surname>Kiraz</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Computer Engineering</orgName>
								<orgName type="institution">Uludag University Gorukle</orgName>
								<address>
									<settlement>Bursa</settlement>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Cengiz</forename><surname>Toğay</surname></persName>
							<email>ctogay@uludag.edu.tr</email>
							<affiliation key="aff1">
								<orgName type="department">Computer Engineering</orgName>
								<orgName type="institution">Uludag University Gorukle</orgName>
								<address>
									<settlement>Bursa</settlement>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">IoT Data Storage: Relational &amp; Non-Relational Database Management Systems Performance Comparison</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">69ABD266DB7097BF294453A715B8C78D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T09:02+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Internet of Things</term>
					<term>MySQL</term>
					<term>MongoDB</term>
					<term>RDBMS</term>
					<term>NRDBMS</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Internet of Things (IoT) has recently become a popular research topic and a market reality. According to several research companies, up to 75 billion devices are expected to be connected to the internet by 2025 and to generate an enormous amount of data. This increase in data causes several difficulties, such as the cost of storing and processing such large volumes. In this paper, we compare the performance of relational (MySQL) and non-relational (MongoDB) database management systems for storing and processing this large IoT data. Both types of database management system were tested. According to the experimental results, the non-relational database management system that we studied provided better performance for storing and processing large data.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>INTRODUCTION</head><p>The Internet of Things (IoT) is a self-configuring and adaptive system that consists of sensor networks and smart objects and aims to interconnect all devices and sensors in daily life <ref type="bibr" target="#b0">[1]</ref>. According to the projections of many organizations and companies, up to 75 billion devices will be interconnected over the internet, which will raise several challenges and issues that need to be addressed. IoT has therefore become one of the most popular research topics in recent years. Moreover, IoT is closely related to big data and cloud technology. Big data is produced by many types of applications, such as industrial processes, medical devices, embedded control systems, gateways, and GPS sensors. This means that the amount of data worldwide increases day by day. In 2016, more than 5.5 million new devices were connected every day, and the number of devices is expected to exceed 20.8 billion worldwide by 2020 <ref type="bibr" target="#b1">[2]</ref>. Sensors also produce data and are important contributors to the growth of big data. Sensor data is the most common data type among IoT applications.</p><p>Big data is a term that describes large volumes of data, both structured and unstructured. Database management systems (DBMSs) can broadly be divided into relational and non-relational systems. A relational DBMS stores data in the rows and columns of tables with high data consistency. The most commonly used open source relational DBMS is MySQL. Non-relational databases (NoSQL) have arisen as an alternative to relational databases. NoSQL systems often do not aim to guarantee Atomicity, Consistency, Isolation, and Durability (ACID). NoSQL does not depend on fixed table definitions or rigid schemas. Columns or records can be added to a collection at any time without a special procedure. Therefore, the records in a collection do not all have to contain the same number of columns. 
Data sets in an IoT environment can change after the system is set up, so this environment requires a flexible data storage system. NoSQL has four different storage formats: key-value, column-based, document-based, and graph-based.</p><p>This study investigates the document-based storage format in NoSQL. In such a system, a record is called a document, and documents are usually stored in JSON format <ref type="bibr" target="#b2">[3]</ref>. There are various NoSQL DBMS implementations, such as MongoDB <ref type="bibr" target="#b3">[4]</ref> [5], CouchDB <ref type="bibr" target="#b5">[6]</ref>, HBase <ref type="bibr" target="#b6">[7]</ref>, Cassandra <ref type="bibr" target="#b7">[8]</ref>, Amazon SimpleDB <ref type="bibr" target="#b8">[9]</ref>, and Redis <ref type="bibr" target="#b9">[10]</ref>. Since MongoDB is open source and widely used, it was chosen for this study. There are no database schemas or tables in MongoDB. To store data, MongoDB uses a "collection" instead of a table and a "document" instead of a row. Furthermore, MongoDB offers two alternatives to the join operation: nesting documents inside each other, and storing a reference to another document rather than nesting the entire document.</p><p>There are many studies that compare the performance of databases <ref type="bibr">[11] [12]</ref> [13] <ref type="bibr" target="#b13">[14]</ref>. These studies differ in data size, variety of data, the databases used, implementation languages, and the subjects of the projects. In <ref type="bibr" target="#b14">[15]</ref>, MongoDB, MySQL, CouchDB, and Redis are compared. That study reports that MongoDB performs best among the compared database management systems in terms of "bulk insert" write performance. However, MySQL and MongoDB have similar performance results for read operations. 
The performance differences between these DBMSs can be negligible (typically less than 1 second) <ref type="bibr" target="#b14">[15]</ref>. However, our test results show that MongoDB performs better than MySQL in terms of both reading and writing, as presented in the "Results of Experiments" section. MongoDB has been used to store GPS sensor data and to communicate with analysis tools such as Apache Mahout <ref type="bibr" target="#b10">[11]</ref>. ACID operations have also been applied on MongoDB and MySQL to compare them <ref type="bibr">[12] [13]</ref>. According to the results, the use of MongoDB is encouraged for large data applications, especially big data applications <ref type="bibr" target="#b11">[12]</ref>. In <ref type="bibr" target="#b13">[14]</ref>, MongoDB, Raven, CouchDB, Cassandra, HyperTable, CouchBase, and SQL DBMSs are compared in terms of read, write, delete, and instantiate operations. According to the results of this study <ref type="bibr" target="#b13">[14]</ref>, when all keys need to be fetched, MongoDB performs better than the others.</p><p>The aim of the experiments is to compare relational and non-relational DBMSs for use in an IoT platform. The system includes IoT devices that publish a tremendous amount of sensor data, which servers store and process. Read and write performance tests were carried out on both MySQL and MongoDB in this study. The test results, which are calculated after the application has run at least three times, are compared to determine where IoT data can be stored at the lowest cost in terms of throughput.</p><p>We define an IoT platform that collects and processes sensor data, as shown in Figure <ref type="figure" target="#fig_0">1</ref>. The sensor data is produced by devices/clients and collected by the Data Storage Server (DSS) on the server side. The data is stored in a DBMS through insert and update operations. 
The Data Query Server (DQS) in the platform provides an interface for processing and reporting by the client application. The client application sends a data request to the DQS, where it is turned into a query for the DBMS. The result of the query is delivered to the client application. In our case, an enormous amount of data must be stored in the DBMS; therefore, write operations are more important than read operations. The platform has an active-active architecture. Therefore, more than one DSS can handle the data  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>TEST ENVIRONMENT &amp; METHODOLOGY</head><p>In this paper, relational and non-relational databases are studied. The servers and clients of the chosen DBMSs are set up separately in virtual machines running Ubuntu 16.04 Server. The virtual machines are hosted on a physical machine (i7 6700HQ, 16 GB DDR3 RAM, and an SSD disk). The virtual machines (4 CPU cores, 4 GB RAM) run on the physical servers as depicted in Figure <ref type="figure" target="#fig_1">2</ref>. A dedicated network is set up among the servers, which guarantees that other network traffic does not disrupt the tests. Both DBMSs are installed on the computer with the SSD for the fastest possible read and write speeds, and they are run separately during the test scenarios. Multithreaded Java applications are implemented to perform read and write operations on the DBMSs.</p><p>The constraints of the experiments are the number of machines, the number of threads, the number of messages, and the size of the string. These constraints are applied to both DBMSs. The application runs on a virtual machine; therefore, it is limited in terms of CPU cores and memory. Since write operations are more important than read operations in this study, our tests concentrate on write operations. The test data are chosen to resemble data from real-time sensor applications.</p><p>In this study, two columns are defined: one of variable-length string type and one of integer type. A primary key column is automatically defined in MySQL, but it needs to be defined in MongoDB. The execution time of the tests is measured in milliseconds. Throughput is calculated by dividing the total number of messages by the experiment's execution time.</p></div>
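<div xmlns="http://www.tei-c.org/ns/1.0"><p>The throughput measurement described above can be illustrated with a minimal sketch. The original tests used multithreaded Java applications against MySQL and MongoDB; the Python version below is only an illustration under stated assumptions, not the authors' code. The names throughput, run_insert_test, and InMemorySink are hypothetical, and the in-memory sink stands in for a real database client.</p>
<p><![CDATA[
```python
import threading
import time


def throughput(total_messages, elapsed_ms):
    """Messages per millisecond: total messages divided by execution time."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return total_messages / elapsed_ms


class InMemorySink:
    """Hypothetical stand-in for a DBMS client; it only collects rows in a list."""

    def __init__(self):
        self._lock = threading.Lock()
        self.rows = []

    def write(self, text, number):
        # One "insert": a string column and an integer column.
        with self._lock:
            self.rows.append((text, number))


def run_insert_test(num_threads, messages_per_thread, write_fn, string_length=100):
    """Run the writers concurrently and return (total_messages, elapsed_ms)."""
    payload = "x" * string_length

    def worker(thread_id):
        for i in range(messages_per_thread):
            write_fn(payload, thread_id * messages_per_thread + i)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return num_threads * messages_per_thread, elapsed_ms


sink = InMemorySink()
total, elapsed_ms = run_insert_test(num_threads=4, messages_per_thread=250, write_fn=sink.write)
print(f"{total} messages in {elapsed_ms:.2f} ms -> {throughput(total, elapsed_ms):.2f} messages/ms")
```
]]></p>
<p>Replacing InMemorySink.write with a call to an actual database driver would turn this into a real write test; the numbers printed by the sketch itself say nothing about either DBMS.</p></div>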
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 3. Result Graph of Write Operations on MySQL</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>RESULTS OF EXPERIMENTS</head><p>Insert tests are executed with a multithreaded Java application. The application sends the DBMSs insert requests, each containing a string (100 characters) and an integer value. The application is run on one or more computers according to the test parameters (number of computers and number of threads), as depicted in Figure <ref type="figure">3</ref> and Figure <ref type="figure" target="#fig_2">4</ref>. The best average throughput for MySQL is 18.21 messages/millisecond, achieved with two computers running twenty threads each, as presented in Figure <ref type="figure">3</ref>. A total of forty threads gives similar results whether one or two computers are used. Throughput rises as the number of threads increases; however, when the number of threads reaches eighty, throughput begins to decrease. We therefore conclude that MySQL gives its best results with forty threads.</p><p>Similarly, the best average throughput for MongoDB is 70.95 messages/millisecond, achieved with two computers running forty threads each, as presented in Figure <ref type="figure" target="#fig_2">4</ref>. To eliminate the effects of thread switching, a third computer is also used for MongoDB. The throughput with one, two, and three computers is 61.22, 70.95, and 61.65 messages/millisecond, respectively, so the two-computer configuration performs best. The figures also show which thread counts give the best result in each configuration. For MySQL, these are forty threads on one computer and twenty threads per computer on two computers; for MongoDB, eighty threads on one computer, forty threads per computer on two computers, and twenty threads per computer on three computers. MongoDB therefore sustains its best throughput with sixty or more threads in total.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 5. Results of String Length Experiments on MySQL</head><p>We also tested data with different string lengths, as depicted in Figure <ref type="figure">5</ref> and Figure <ref type="figure">6</ref>. Since the best results were obtained with two computers, only the two-computer cases are tested here. For MySQL, average throughputs of 18.21, 9.33, and 6.13 messages/millisecond are obtained with string lengths of 100, 1000, and 2000 characters, respectively, as depicted in Figure <ref type="figure">5</ref>. As in the previous results, MySQL achieves its best throughput with twenty threads. MongoDB achieves its best throughput with forty threads: 70.95, 49.04, and 39.85 messages/millisecond on average are obtained with strings of 100, 1000, and 2000 characters, respectively, as depicted in Figure <ref type="figure">6</ref>. MongoDB's throughput is about four times that of MySQL for 100-character strings, and the gap widens for longer strings. The length of the message is thus one of the most important parameters. For example, when the length is doubled from 1000 to 2000 characters, MongoDB's throughput decreases by about twenty percent.</p><p>Select tests are also executed with the same multithreaded Java application. The application retrieves data from the two DBMSs. The application is run on one computer and on two computers with different numbers of threads, as depicted in Figure <ref type="figure">7</ref> and Figure 8.</p></div>
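<div xmlns="http://www.tei-c.org/ns/1.0"><p>The string-length comparison above can be checked with a few lines of arithmetic. The sketch below uses only the average write throughputs quoted in the text for the two-computer case; the function names speedup and percent_decrease are hypothetical helpers introduced for illustration.</p>
<p><![CDATA[
```python
# Average write throughput in messages/millisecond, as reported in the text
# for two computers; keys are string lengths in characters.
MYSQL = {100: 18.21, 1000: 9.33, 2000: 6.13}
MONGODB = {100: 70.95, 1000: 49.04, 2000: 39.85}


def speedup(length):
    """How many times higher MongoDB's throughput is than MySQL's."""
    return MONGODB[length] / MYSQL[length]


def percent_decrease(results, shorter, longer):
    """Relative throughput loss when moving from the shorter to the longer string."""
    return 100.0 * (results[shorter] - results[longer]) / results[shorter]


for length in sorted(MYSQL):
    print(f"{length} chars: MongoDB/MySQL = {speedup(length):.2f}")
print(f"MongoDB 1000 to 2000: -{percent_decrease(MONGODB, 1000, 2000):.1f}%")
print(f"MySQL 1000 to 2000: -{percent_decrease(MYSQL, 1000, 2000):.1f}%")
```
]]></p>
<p>With the reported figures, the MongoDB to MySQL ratio is roughly 3.9, 5.3, and 6.5 for 100, 1000, and 2000 characters, and doubling the length from 1000 to 2000 characters costs about 19 percent of throughput for MongoDB and about 34 percent for MySQL.</p></div>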
<div xmlns="http://www.tei-c.org/ns/1.0"><head>CONCLUSIONS</head><p>Our IoT platform requires that, on average, forty messages/millisecond be written to the chosen DBMS. Otherwise, the number of messages waiting in the write queue will grow, which can cause memory problems. The IoT platform uses an active-active architecture, so more than one computer can write to the DBMS at the same time. Test results show that MongoDB performs better than MySQL in terms of both write and read operations. In the IoT platform, the message payload varies between 100 bytes and 200 bytes. For messages of this type, MongoDB achieves 70.95 messages/millisecond on average and MySQL achieves 18.21 messages/millisecond on average for writing. Only MongoDB satisfies the target of 40 messages/millisecond on average. MongoDB can also satisfy the requirement with a single machine, which achieves a write throughput of 61.22 messages/millisecond.</p></div>
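<div xmlns="http://www.tei-c.org/ns/1.0"><p>The queueing argument above can be made concrete with a small sketch. It assumes, purely for illustration, that messages arrive at a constant rate equal to the target and that the DBMS writes at its measured average rate; real traffic would fluctuate. The names meets_target and backlog_after are hypothetical.</p>
<p><![CDATA[
```python
TARGET = 40.0  # required average write rate, messages/millisecond (from the text)

# Best average write throughput reported in the text, messages/millisecond.
MEASURED = {
    "MySQL, two computers": 18.21,
    "MongoDB, one computer": 61.22,
    "MongoDB, two computers": 70.95,
}


def meets_target(write_rate, target=TARGET):
    """True when the DBMS can absorb messages at least as fast as required."""
    return write_rate >= target


def backlog_after(arrival_rate, write_rate, duration_ms):
    """Messages left in the queue after duration_ms, assuming constant rates."""
    return max(0.0, (arrival_rate - write_rate) * duration_ms)


for name, rate in MEASURED.items():
    queued = backlog_after(TARGET, rate, duration_ms=1000)
    print(f"{name}: meets target = {meets_target(rate)}, backlog after 1 s = {queued:.0f}")
```
]]></p>
<p>Under these assumptions, a DBMS writing 18.21 messages/millisecond against an arrival rate of 40 would leave about 21,790 messages queued after one second, while both MongoDB configurations keep the queue empty.</p></div>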
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 8. Result Graph of Read Operations on MongoDB</head><p>The DQS in the IoT platform performs read operations on the DBMS. Test results show that MongoDB has better throughput than MySQL: with two computers, MongoDB achieves 55.07 messages/millisecond on average and MySQL achieves 46.66 messages/millisecond on average. Therefore, MongoDB is selected as the DBMS for write and read operations in the IoT platform. </p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 .</head><label>1</label><figDesc>Figure 1. The IoT Platform Architectural Structure</figDesc><graphic coords="2,70.90,566.51,223.74,111.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 .</head><label>2</label><figDesc>Figure 2. Test Environment writing transactions. In our platform, the target throughput is forty messages/millisecond on average for writing. The target throughput and the test results are compared in the "Results of Experiments" section. The rest of the article is organized as follows. The "Test Environment &amp; Methodology" section describes the environment of the experiments and the methodology. The "Results of Experiments" section presents the results and graphs from the experiments. The "Conclusions" section summarizes and concludes the experiments and gives a recommendation for the future of this study.</figDesc><graphic coords="2,325.15,42.55,227.40,94.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 .</head><label>4</label><figDesc>Figure 4. Result Graph of Write Operations on MongoDB</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 6 .</head><label>6</label><figDesc>Figure 6. Results of String Length Experiments on MongoDB</figDesc></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ACKNOWLEDGEMENT</head><p>These results come from a preliminary study for a project proposal submitted to the TUBITAK 1505 University-Industry Collaboration Grant Program. The study is supported by EMKO Electronic A.Ş., located in Bursa, Turkey.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<ptr target="https://connectedtechnbiz.wordpress.com/tag/internet-of-things/" />
		<title level="m">IoT</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<ptr target="http://www.gartner.com/newsroom/id/3165317" />
		<title level="m">Gartner Says 6.4 Billion Connected &apos;Things&apos; Will Be in Use in 2016, Up 30 Percent From 2015</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<ptr target="http://json.org/" />
		<title level="m">JSON</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<ptr target="https://docs.mongodb.com/manual/" />
		<title level="m">MongoDB Documentation</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">MongoDB: The Definitive Guide</title>
		<author>
			<persName><forename type="first">K</forename><surname>Chodorow</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<ptr target="http://couchdb.apache.org/" />
		<title level="m">CouchDB</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">HBASE</title>
		<ptr target="https://hbase.apache.org/" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Cassandra</title>
		<ptr target="http://cassandra.apache.org/" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Amazon SimpleDB</title>
		<ptr target="https://aws.amazon.com/simpledb/" />
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<ptr target="https://redis.io/" />
		<title level="m">Redis</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Architecture and implementation of a scalable sensor data storage and analysis system using cloud computing and big data technologies</title>
		<author>
			<persName><forename type="first">G</forename><surname>Aydin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">R</forename><surname>Hallac</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Karakus</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Sensors</title>
		<imprint>
			<biblScope unit="volume">2015</biblScope>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Comparison of Relational Database with Document-Oriented Database (MongoDB) for Big Data Applications</title>
		<author>
			<persName><forename type="first">S</forename><surname>Chickerur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Goudar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kinnerkar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. 8th Int. Conf. Adv. Softw. Eng. Its Appl. ASEA 2015</title>
				<meeting>8th Int. Conf. Adv. Softw. Eng. Its Appl. ASEA 2015</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="41" to="47" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Comparing NoSQL MongoDB to an SQL DB</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Parker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Poe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">V</forename><surname>Vrbsky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. 51st ACM Southeast Conf. - ACMSE &apos;13</title>
				<meeting>51st ACM Southeast Conf. - ACMSE &apos;13</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">A performance comparison of SQL and NoSQL databases</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Manoharan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Pacific Rim Conf. Commun. Comput. Signal Process. - Proc.</title>
				<imprint>
			<date type="published" when="2013-08">August 2013</date>
			<biblScope unit="page" from="15" to="19" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Cloud databases for internet-of-things data</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">T A</forename><surname>Mai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">K</forename><surname>Nurminen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Di Francesco</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Int. Conf. Green Comput. Commun. GreenCom 2014 2014 IEEE Int. Conf. Cyber-Physical-Social Comput. CPS 20</title>
				<imprint>
			<publisher>iThings</publisher>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1" to="10" />
		</imprint>
	</monogr>
	<note>Proc. -2014 IEEE Int. Conf. Internet Things, iThings 2014</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
