<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Data Performance Evaluation of Cloud Storage Providers</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Aleksandar</forename><surname>Dimov</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Mathematics and Informatics</orgName>
								<orgName type="institution">Sofia University</orgName>
								<address>
									<addrLine>5 James Bourchier Blvd</addrLine>
									<postCode>1164</postCode>
									<settlement>Sofia</settlement>
									<country>Bulgaria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Stanimir</forename><surname>Kirov</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Mathematics and Informatics</orgName>
								<orgName type="institution">Sofia University</orgName>
								<address>
									<addrLine>5 James Bourchier Blvd</addrLine>
									<postCode>1164</postCode>
									<settlement>Sofia</settlement>
									<country>Bulgaria</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Data Performance Evaluation of Cloud Storage Providers</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">78344BF68719985A3853E99DAAB4609D</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T01:08+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Data performance comparison</term>
					<term>cloud storage providers</term>
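					<term>data-intensive systems</term>
					<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Performance results of Cloud storage providers</figDesc><table><row><cell>Operation</cell><cell cols="3">Average Times (mm:ss)</cell></row><row><cell></cell><cell>Google Drive</cell><cell>OneDrive</cell><cell>DropBox</cell></row><row><cell>Create share</cell><cell>00:01.7</cell><cell>00:02.4</cell><cell>00:01.3</cell></row><row><cell>List share</cell><cell>00:00.8</cell><cell>00:01.4</cell><cell>00:00.9</cell></row><row><cell>Move share</cell><cell>00:01.3</cell><cell>00:02.5</cell><cell>00:02.0</cell></row><row><cell>Copy share</cell><cell>00:01.4</cell><cell>00:02.8</cell><cell>00:01.4</cell></row><row><cell>Delete share</cell><cell>00:01.2</cell><cell>00:01.8</cell><cell>00:01.4</cell></row><row><cell>Upload</cell><cell>03:57.4</cell><cell>04:36.2</cell><cell>04:32.7</cell></row><row><cell>Download</cell><cell>05:41.3</cell><cell>06:06.8</cell><cell>02:30.1</cell></row><row><cell>Copy</cell><cell>00:04.7</cell><cell>00:06.7</cell><cell>00:02.9</cell></row><row><cell>Delete</cell><cell>00:02.2</cell><cell>00:03.1</cell><cell>00:01.7</cell></row><row><cell>Duration of all tests</cell><cell>09:52.0</cell><cell>11:03.8</cell><cell>07:14.3</cell></row></table></figure>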
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Many current software systems are data-intensive, which presents new challenges not only to IT and software professionals but also to business and individual users. Some of these challenges relate to decisions on how to store the data that data-intensive systems work with. One common solution is to use cloud storage, which is most often offered by a third party. This paper presents a methodology for evaluating cloud storage providers in the realm of data-intensive systems, based on the fundamental operations that their services provide. It also compares the performance of some popular cloud storage services in terms of the execution times of these operations.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>An important concern in the realm of data-intensive systems is how users and businesses are going to store their data. Both regular and business users increasingly rely on cloud-based storage solutions instead of on-premises local storage hardware. The most significant reasons for this include security, availability, scalability and cost-effectiveness. There is a growing tendency to migrate data to the cloud, or to seriously consider building on the cloud when developing new solutions. In this sense, software engineers and IT professionals are interested in having the means for a well-informed selection of specific solutions, based on quality of service.</p><p>Additionally, most contemporary systems are data-intensive <ref type="bibr" target="#b0">[1]</ref>, <ref type="bibr" target="#b1">[2]</ref>, which means that they rely heavily on data storage and on the quality characteristics of such storage. Such systems also often perform data analysis and analytical processing, which may be required to happen in real time. For such systems, optimizing performance becomes especially significant.</p><p>Information Systems &amp; Grid Technologies: Fifteenth International Conference ISGT'2022, May 27-28, 2022, Sofia, Bulgaria</p><p>However, in the current environment, it may be difficult to select an appropriate cloud storage provider, as many such services exist. Users need the means to select the best option in a straightforward way. One of the first things someone should do when choosing between cloud services is to compare storage options, features, and costs. The next consideration is the dependence on a single vendor for so many critical needs: if all your data is in the hands of one service provider, your dependence on that provider is huge. To avoid this, users may implement a multi-cloud architecture. 
By using a multi-cloud storage connection tool, one can easily switch between the cloud service providers that the tool supports.</p><p>The goal of this paper is to provide a methodological framework for testing cloud storage providers and to show particular results for some of the most popular free storage services. The research question addressed by this study is "What are the main factors that users should employ to evaluate cloud storage solutions, and how should they pick the provider that is right for their needs?".</p><p>The rest of the paper is structured as follows: Section 2 gives an overview of the related work in the area; Section 3 presents the methodological framework of our approach to testing cloud storage providers; Section 4 describes the specifics of the testing environment and the experiments we have made; Section 5 presents and analyses the results; and finally, Section 6 concludes the paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related work</head><p>A number of research works relate directly to ours and aim at performance comparison of cloud storage providers.</p><p>One example is <ref type="bibr" target="#b4">[5]</ref>, where Google Cloud Service and iCloud are compared by exploring the features of these two cloud storage services.</p><p>In <ref type="bibr" target="#b5">[6]</ref>, the authors tested the performance of several cloud storage providers, including Google Drive and Dropbox, and analysed their applicability in healthcare services by using medical image files for testing and comparison. As in this paper, the comparison was based on the duration of several commands, including upload, download, and file deletion.</p><p>Another comparison of some popular cloud storage services is provided in <ref type="bibr" target="#b6">[7]</ref>. The authors aim to help users choose the right cloud storage service by comparing the services on 10 different factors, including performance, which is evaluated by uploading and downloading files of two different sizes.</p><p>There also exist several non-academic surveys <ref type="bibr" target="#b2">[3]</ref>, <ref type="bibr" target="#b3">[4]</ref> that try to rate cloud storage providers; however, they do not focus on a methodological approach to testing, but rather just compare the properties of the different plans that cloud storage providers offer.</p><p>Another direction of research that has some relation to our work concerns performance testing of various cloud services. An example is <ref type="bibr" target="#b7">[8]</ref>, where High-Performance Computing (HPC) is evaluated through a performance comparison of the Google services Cloud Functions (Function-as-a-Service) and Compute Engine (Infrastructure-as-a-Service).</p><p>In conclusion, there exists a lot of work on the comparison of cloud services, and of cloud storage in particular. 
However, in this paper we try to fill the gap in cloud evaluation with respect to data-intensive systems. For this purpose, in the next section we present our methodology for testing performance, which is specifically targeted at storage service operations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Comparison methodology</head><p>This section explains the methodological approach for comparing different cloud storage providers.</p><p>The test environment should be fully isolated from other applications, in order to prevent data interference. An additional application is also needed to provide a bridge between the test environment and the cloud providers under test. It also serves as a wrapper that allows access to different cloud providers and provides the same, fair conditions for all of them.</p><p>We perform the test in three main phases:</p><p>1. Pre-test phase: a share is created, which is going to be used in the test phase to check the performance of cloud data storage providers.</p><p>2. Test execution phase: this phase consists of the execution of 9 operations common to each operating system, and the execution time is measured for each of them. These operations are the following:</p><p>a) Create share: this operation is used to create a location for storing files;</p><p>b) List share: this operation is used to show the files in the listed share/directory;</p><p>c) Move share: this operation is used to move a directory, its subdirectories (if available) and its files within the share;</p><p>d) Copy share: this operation is used to copy a directory, its subdirectories (if available) and its files within the share;</p><p>e) Delete share: this operation is used to delete a directory, its subdirectories (if available) and its files within the share;</p><p>f) Upload file: this operation is used to transfer data from the source (computer/PC) to the destination (the cloud share, in the case of this project);</p><p>g) Download file: this operation is used to transfer data from the source (cloud share) to the destination (computer/PC);</p><p>h) Copy file: this operation is used to copy files;</p><p>i) Delete file: this operation is used to remove a file from the share created by the Create share operation.</p><p>3. Post-test phase: this phase prepares for the next iteration of the test execution phase. It includes cleaning up the test file that was created during the previous phase. This is needed because free accounts, which have limited storage space, are used.</p></div>
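<div xmlns="http://www.tei-c.org/ns/1.0"><p>The three phases above can be sketched as a small timing harness. The following Python code is only an illustration under stated assumptions, not the tooling used in the experiments: the function names run_iteration and average_timings are hypothetical, and each operation is assumed to be wrapped in a callable that returns once the operation has completed.</p>
<p>
```python
import time
from typing import Callable, Dict, List


def run_iteration(
    pre_test: Callable[[], None],
    operations: Dict[str, Callable[[], None]],
    post_test: Callable[[], None],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Run one pre-test / test execution / post-test cycle.

    Returns the elapsed time in seconds of every operation, keyed by name.
    All names here are hypothetical and chosen for illustration only.
    """
    pre_test()  # phase 1: prepare the share used by the test
    timings: Dict[str, float] = {}
    try:
        # phase 2: execute the operations in order and time each of them
        for name, operation in operations.items():
            start = clock()
            operation()
            timings[name] = clock() - start
    finally:
        post_test()  # phase 3: clean up, even if an operation failed
    return timings


def average_timings(runs: List[Dict[str, float]]) -> Dict[str, float]:
    """Average the per-operation timings over several iterations."""
    return {name: sum(run[name] for run in runs) / len(runs) for name in runs[0]}
```
</p>
<p>Calling run_iteration several times and passing the collected results to average_timings yields one average time per operation. The clock parameter exists only so that the harness can be exercised with a deterministic clock.</p></div>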
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 1: Cloud providers testing methodology</head><p>Testing of a cloud storage provider should be performed while treating it as a black box. Normally, one should not be able to obtain any internal information about the cloud architecture or infrastructure, as this would be considered a security breach; if it happened, the cloud infrastructure could be classified as highly unreliable. For this reason, we use an opaque testing technique, in which only the fundamental, externally visible aspects of the system are explored. In this way, more data may be collected, and accurate conclusions can be drawn about the behaviour and response of different cloud storage vendors under our setup.</p><p>In order to perform the test, we should satisfy the following requirements, which are intended to provide the fairest test conditions:</p><p>1. A single platform or application should be used to access the different cloud storage providers.</p><p>2. Virtualization should be used, limited to a single virtual machine. This provides an isolated environment and is a safe, efficient, cheap and flexible way to test applications: one can test everything from server configurations to resource allocation and, most importantly for us, storage.</p><p>3. The operating system should be lightweight and handle resources well, so that it interferes as little as possible with the application and the test results are as accurate as possible.</p><p>4. It should be considered that cloud storage has different characteristics for different uses (different end users or companies could make use of the service in different ways). For this reason, we focus only on file-system based operations and use a single application to access the different cloud storage services offered by vendors.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Building the testing environment</head><p>We use the Rclone<ref type="foot" target="#foot_0">2</ref> command-line tool as an intermediary application between the client and the cloud provider service, which provides the integration between them. Rclone is a tool written in the Go programming language that is used to transfer data between a computer and cloud-hosted data storage. It can connect to various cloud storage services. In this way, the requirement for a single platform with access to the different cloud storage services offered by vendors is fulfilled.</p><p>Another objective of using the Rclone command-line tool is to produce a multi-service cloud delivery model. By developing and implementing it, we can compare the supported storage services from a performance perspective. The architecture of the test environment is shown in Figure <ref type="figure" target="#fig_0">2</ref>. To provide virtualization, Oracle VirtualBox is used. It is a deceptively simple, but powerful and free to use cross-platform virtualization application for x86 hardware, targeted at server, desktop and embedded use <ref type="bibr" target="#b4">[5]</ref>.</p><p>As an operating system, the CentOS Linux distribution was used, as it is a stable, predictable, manageable, and reproducible platform derived from the sources of Red Hat Enterprise Linux <ref type="bibr" target="#b5">[6]</ref>, <ref type="bibr" target="#b6">[7]</ref>. It is available free of charge, and technical support is primarily provided by the community via official mailing lists, web forums, and chat rooms. Other reasons for choosing it are that it has good documentation, is highly customizable and is supported by VirtualBox.</p><p>As defined in the methodology description in Section 3, we have to implement the operations that are most used on storage. 
In the list below, each operation is shown together with the specific Rclone command that was used to execute it: </p></div>
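<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration only, one possible mapping of the nine operations from Section 3 onto Rclone subcommands can be sketched in Python as follows. The choice of subcommands (mkdir, lsf, move, copy, purge, copyto, deletefile), the remote name and all paths are assumptions made for this sketch and may differ from the commands used in the experiments; the helper names build_commands and timed_run are hypothetical.</p>
<p>
```python
import subprocess
import time
from typing import Dict, List


def build_commands(remote: str, share: str, local_file: str, local_dir: str) -> Dict[str, List[str]]:
    """Return one assumed rclone command line per operation.

    All names and paths are hypothetical. This is a mapping from operation
    to command, not an execution order.
    """
    base = f"{remote}:{share}"
    file_name = local_file.rsplit("/", 1)[-1]
    remote_file = f"{base}/{file_name}"
    return {
        "create share": ["rclone", "mkdir", base],
        "list share": ["rclone", "lsf", base],
        "move share": ["rclone", "move", f"{base}/dir", f"{base}/dir-moved"],
        "copy share": ["rclone", "copy", f"{base}/dir", f"{base}/dir-copy"],
        "delete share": ["rclone", "purge", f"{base}/dir"],
        "upload file": ["rclone", "copy", local_file, base],
        "download file": ["rclone", "copy", remote_file, local_dir],
        "copy file": ["rclone", "copyto", remote_file, f"{remote_file}.copy"],
        "delete file": ["rclone", "deletefile", remote_file],
    }


def timed_run(command: List[str]) -> float:
    """Run one command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)  # raises if the command fails
    return time.perf_counter() - start
```
</p>
<p>For example, build_commands("gdrive", "test-share", "/tmp/test-1g.bin", "/tmp/download") returns the nine command lines for a remote that has been configured in Rclone under the hypothetical name gdrive, and each of them can be passed to timed_run to obtain its execution time.</p></div>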
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Experiment results and analysis</head><p>This section presents the results of the performance comparison of cloud service providers. After presenting the particular and average execution times of each command listed in the previous section, we also analyse the different providers based on their pricing plans. To perform the test, a single 1 GB file with randomly generated contents is used.</p><p>All times shown in the tables and figures in this section, as well as in the appendix, are in the format minutes:seconds (mm:ss).</p><p>The experiment described in this section was undertaken under two important assumptions:</p><p>• We test only the free services delivered by cloud storage providers. This is an important assumption, because a given cloud service provider may limit the resources available to its free tier services, while increasing or removing that limit for the paid plans.</p><p>• The analysis of the pricing plans of cloud storage providers covers only the monthly plans of each provider, and only those for personal users. This is important, because many providers offer additional services on top of storage, which may influence the price of storage. Cloud providers also offer additional subscriptions, such as annual, family, business, and enterprise plans, which may vary significantly in terms of pricing.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Test Results</head><p>The results of the tests, performed with the environment and methodology described in Section 4, are shown in Table <ref type="table">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 1 Performance results of Cloud storage providers</head><p>As seen from Table <ref type="table">1</ref>, the performance of the three compared cloud storage providers is similar, with slight underperformance of OneDrive in share/directory operations (Figure <ref type="figure" target="#fig_1">3</ref>). The upload times of all three providers are also similar, whereas the average download time of DropBox (02:30.1) is noticeably shorter than those of Google Drive (05:41.3) and OneDrive (06:06.8) (Figure <ref type="figure" target="#fig_2">4</ref>). A more detailed table with the test results is presented in Appendix 1.</p></div>
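<div xmlns="http://www.tei-c.org/ns/1.0"><p>The mm:ss values in Table 1 can be checked with simple arithmetic. The sketch below uses the hypothetical helper names parse_mmss and format_mmss to convert the Google Drive averages reported in Table 1 to tenths of a second and to sum them; the result matches the reported duration of all tests for Google Drive.</p>
<p>
```python
from typing import List


def parse_mmss(text: str) -> int:
    """Convert a time such as "03:57.4" to tenths of a second."""
    minutes, seconds = text.split(":")
    whole, tenth = seconds.split(".")
    return int(minutes) * 600 + int(whole) * 10 + int(tenth)


def format_mmss(tenths: int) -> str:
    """Convert tenths of a second back to the mm:ss.t format."""
    minutes, rest = divmod(tenths, 600)
    seconds, tenth = divmod(rest, 10)
    return f"{minutes:02d}:{seconds:02d}.{tenth}"


# Google Drive averages from Table 1, in the order of the table rows.
GOOGLE_DRIVE: List[str] = [
    "00:01.7", "00:00.8", "00:01.3", "00:01.4", "00:01.2",
    "03:57.4", "05:41.3", "00:04.7", "00:02.2",
]

total = format_mmss(sum(parse_mmss(value) for value in GOOGLE_DRIVE))
print(total)  # 09:52.0
```
</p>
<p>Working in integer tenths of a second avoids floating-point rounding when many short durations are added together.</p></div>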
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Pricing plans evaluation</head><p>All cloud storage providers have consumer storage plans and also offer separate storage plans for business. Here we focus on consumer storage plans. Please note that all prices refer to individual accounts and are not options for businesses. Also, depending on the plan, every provider offers bonus features that are not part of our research.</p><p>This research shows that the pricing plans of the tested cloud storage providers are almost the same. However, it should be noted that Google Drive offers the largest storage space in its free plan. It is also one of the most generous cloud storage providers in terms of plans, even though the storage of the free plan is shared between the different services that Google offers. At first glance, Dropbox probably has the best pricing offers for bigger storage needs and offers the best price per space ratio. However, it should also be noted that most providers offer users a large number of other services together with the storage. A fair pricing comparison of cloud storage providers therefore requires a more complex methodology and set of criteria.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>In terms of data-intensive systems, it is worthwhile to be able to evaluate the different storage options available to small businesses and individual users. In contemporary systems, most data is stored in the cloud using the services of different cloud storage providers. This paper presents a methodological framework for evaluating cloud storage providers in terms of their performance parameters. It also presents details of a specific testing environment and the results of testing the performance of three popular cloud providers that also offer free storage options. Additionally, a comparison of the pricing plans of these providers is performed; however, it is difficult to assess them in this respect, as most subscriptions include other services besides storage.</p><p>It should be noted that a certain drawback of cloud solutions is bandwidth limitations, and the end user network is a very important part of the cloud service. If the network is slow and unstable, it may cause trouble when accessing or sharing files and may even make it impossible to work in this kind of environment. However, investigating how the end user network affects the performance of cloud storage providers is part of our further research.</p><p>Directions for future research include:</p><p>• Extending the comparison to more service providers.</p><p>• Development of a methodology for comparing other quality characteristics of cloud storage providers, such as reliability, availability, security and cost-effectiveness. 
It may also be beneficial to define a compound measure for cloud storage quality of service, by combining the results of the various tests of such characteristics.</p><p>• The experiment may be expanded to include more diverse tests, for example with various file sizes, or single transactions with a large number of files (both small and large ones), etc.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Architecture of a multi-cloud storage performance test</figDesc><graphic coords="5,51.02,313.16,354.33,174.97" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Average time (mm:ss) of share and directory operations</figDesc><graphic coords="8,88.45,57.92,302.16,171.12" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Average time (mm:ss) of upload and download file operations (file size is 1 GB)</figDesc><graphic coords="8,90.97,261.11,297.12,169.44" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="4,85.81,116.48,307.44,194.88" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Cloud storage providers pricing comparison</figDesc><table><row><cell cols="2">Google Drive</cell><cell cols="2">OneDrive</cell><cell cols="2">Dropbox</cell></row><row><cell>Storage</cell><cell>Price per month</cell><cell>Storage</cell><cell>Price per month</cell><cell>Storage</cell><cell>Price per month</cell></row><row><cell>15 GB</cell><cell>Free</cell><cell>5 GB</cell><cell>Free</cell><cell>2 GB</cell><cell>Free</cell></row><row><cell>30 GB</cell><cell>6$</cell><cell>100 GB</cell><cell>1.99$</cell><cell>2 TB</cell><cell>9.99$ starting</cell></row><row><cell>2 TB</cell><cell>12$</cell><cell>1 TB</cell><cell>6.99$</cell><cell>3 TB</cell><cell>16.58$ starting</cell></row><row><cell>5 TB</cell><cell>18$</cell><cell>6 TB</cell><cell>9.99$</cell><cell></cell><cell></cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">https://rclone.org</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Acknowledgements</head><p>Research presented in this paper is partially supported by the Sofia University "St. Kliment Ohridski" Research Science Fund project No. 80-10-145/23.05.2022 -"Data intensive software architectures".</p><p>Authors of the paper are also grateful to the anonymous reviewers for their valuable comments and remarks, which helped to increase the quality of the paper.</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Appendix: Detailed results of performance comparison</head></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Designing Data-Intensive Applications</title>
		<author>
			<persName><forename type="first">M</forename><surname>Kleppmann</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
			<publisher>O&apos;Reilly</publisher>
			<pubPlace>Beijing</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Jim Gray on eScience: a transformed scientific method</title>
		<author>
			<persName><forename type="first">T</forename><surname>Hey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tansley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">M</forename><surname>Tolle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Fourth Paradigm</title>
				<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<ptr target="https://cloudstorageinfo.org/top-10-cloud-storage-providers" />
		<title level="m">Best cloud storage providers</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<ptr target="https://www.goodcloudstorage.net/cloud-storage-comparison" />
		<title level="m">Cloud Storage Comparison -Table Chart</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A Comparison between Google Cloud Service and iCloud</title>
		<author>
			<persName><forename type="first">H</forename><surname>Arif</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Hajjdiab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">A</forename><surname>Harbi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename></persName>
		</author>
		<idno type="DOI">10.1109/CCOMS.2019.8821744</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE 4th International Conference on Computer and Communication Systems (ICCCS)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="337" to="340" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Performance Evaluation and Comparison of Various Personal Cloud Storage Services for Healthcare Images</title>
		<author>
			<persName><forename type="first">M</forename><surname>Roy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Singh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Cyber Intelligence and Information Retrieval. Lecture Notes in Networks and Systems</title>
				<editor>
			<persName><forename type="first">J</forename><forename type="middle">M R S</forename><surname>Tavares</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Dutta</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Dutta</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Samanta</surname></persName>
		</editor>
		<meeting><address><addrLine>Singapore</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">291</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Cloud storage providers: A comparison review and evaluation</title>
		<author>
			<persName><surname>Zenuni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jaumin</forename><surname>Ajdari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Florie</forename><surname>Ismaili</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bujar</forename><surname>Raufi</surname></persName>
		</author>
		<idno type="DOI">10.1145/2659532.2659609</idno>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="volume">883</biblScope>
			<biblScope unit="page" from="272" to="277" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">HPC in the cloud: Performance comparison of function as a service (FaaS) vs infrastructure as a service (IaaS)</title>
		<author>
			<persName><forename type="first">S</forename><surname>Malla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Christensen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Internet Technology Letters</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">e137</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
