=Paper=
{{Paper
|id=Vol-3191/paper06
|storemode=property
|title=Data Performance Evaluation of Cloud Storage Providers
|pdfUrl=https://ceur-ws.org/Vol-3191/paper06.pdf
|volume=Vol-3191
|authors=Aleksandar Dimov,Stanimir Kirov
|dblpUrl=https://dblp.org/rec/conf/isgt2/DimovK22
}}
==Data Performance Evaluation of Cloud Storage Providers==
Aleksandar Dimov 1 and Stanimir Kirov 1
1 Faculty of Mathematics and Informatics, Sofia University, 5 James Bourchier Blvd., Sofia, 1164, Bulgaria
Abstract
Many current software systems are data-intensive, which presents new challenges not only to IT and software professionals but also to business and individual users. Some of these challenges relate to decisions on how to store the data that data-intensive systems work with. One common solution is to use cloud storage, which is most often offered by a third party. This paper presents a methodology for evaluating cloud storage providers in the realm of data-intensive systems, based on the fundamental operations that their services provide. It also compares the performance of some popular cloud storage services in terms of operation execution times.
Keywords
Data performance comparison; cloud storage providers; data-intensive systems
1. Introduction
An important concern in the realm of data-intensive systems is how users and businesses are going to store their data. Both regular and business users increasingly rely on cloud-based storage solutions instead of on-premises local storage hardware. The most significant reasons for this include security, availability, scalability and cost-effectiveness. The tendency to migrate data to the cloud, or to seriously consider building on the cloud when developing new solutions, is more and more recognizable nowadays. In this sense, software engineers and IT professionals are interested in having means for a well-informed selection of specific solutions, based on quality of service.
Additionally, most contemporary systems are data-intensive [1], [2], which means that they rely heavily on data storage and on the quality characteristics of such storage. Such systems also often perform data analysis and analytical processing, which may be required to happen in real time. In these terms, it becomes especially significant to optimize the performance of such systems.
Information Systems & Grid Technologies: Fifteenth International Conference ISGT’2022, May 27–28, 2022, Sofia, Bulgaria
© 2022 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)
However, in the current environment it may be difficult to select an appropriate cloud storage provider, as there exist a lot of such services. Users need means to select the best option in a straightforward way. One of the first things someone should do when choosing between cloud services is to compare storage options, features, and costs. The next concern is the dependence on a single vendor for so many critical needs: if all data is in the hands of one service provider, the dependence on that provider is huge. To avoid this, users may implement a multi-cloud architecture. By using a multi-cloud storage connection tool, one can easily switch between the cloud service providers that the tool supports.
The goal of this paper is to provide a methodological framework for testing cloud storage providers and to show particular results for some of the most popular free storage services. The research question addressed by this study is “What are the main factors that users should employ to evaluate cloud storage solutions and how to pick the provider that is right for their needs?”.
The rest of the paper is structured as follows: Section 2 gives an overview of the related work in the area; Section 3 presents the methodological framework of our approach to testing cloud storage providers; Section 4 describes the specifics of the testing environment and the experiments we have made; Section 5 presents and analyses the results; and finally, Section 6 concludes the paper.
2. Related work
A number of research works relate directly to ours and aim at performance comparison of cloud storage providers.
One example is [5], where Google Cloud Service and iCloud are compared by exploring the features of these two cloud storage services.
In [6], the authors tested the performance of several cloud storage providers, including Google Drive and Dropbox, and analysed their applicability in healthcare services by using medical image files for testing and comparison. As in this paper, the comparison was based on the duration of several commands, including upload, download, and file deletion.
Another comparison of some popular cloud storage services is provided in [7]. The authors aim to help users choose the right cloud service for storage by comparing 10 different factors, including performance, which is evaluated by uploading and downloading files of two different sizes.
There also exist several non-academic surveys [3], [4] that try to rate cloud storage providers; however, they do not focus on a methodological approach to testing but rather just compare the properties of the different plans that cloud storage providers offer.
Another direction of research that has some relation to our work concerns performance testing of various cloud services. An example is [8], where High-Performance Computing (HPC) is evaluated in terms of a performance comparison of the Google services Cloud Functions (Function-as-a-Service) and Compute Engine (Infrastructure-as-a-Service).
In conclusion, there exists a lot of work on the comparison of cloud services, and of cloud storage in particular. However, in this paper we try to fill the gap in cloud evaluation with respect to data-intensive systems. For this purpose, in the next section we present our methodology for testing performance, which is specifically targeted at storage service operations.
3. Comparison methodology
This section explains the methodological approach for comparing different cloud storage providers.
The test environment should be fully isolated from other applications, in order to prevent data interference. An additional application is also needed to provide a bridge between the test environment and the cloud providers under test. It also serves as a wrapper that allows access to different cloud providers and provides the same, fair conditions for all of them.
We perform the test in three main phases:
1. pre-test phase – a share is created, which is going to be used in the test phase to check the performance of cloud data storage providers;
2. test execution phase – this phase consists of executing 9 operations that are common to every operating system; the execution time of each of them is measured. The operations are the following:
a) Create share – creates a location for storing files;
b) List share – shows the files in the listed share or directory;
c) Move share – moves a directory, with its subdirectories (if any) and files, within the share;
d) Copy share – copies a directory, with its subdirectories (if any) and files, within the share;
e) Delete share – deletes a directory, with its subdirectories (if any) and files, within the share;
f) Upload file – transfers data from a source (the local computer) to a destination (the cloud share, in the case of this project);
g) Download file – transfers data from a source (the cloud share) to a destination (the local computer);
h) Copy file – copies a file;
i) Delete file – removes a file from the share created by the Create share operation.
3. post-test phase – this phase prepares for the next iteration of the test execution phase. It includes cleaning up the test file that was created during the previous phase. This is needed because the free accounts used have limited storage space.
Figure 1: Cloud providers testing methodology
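The three phases can be pictured as a small timing loop. The following Python sketch is only an illustration under stated assumptions: the function name run_iterations, the use of plain callables for the phases, and the number of iterations are all hypothetical and are not taken from the tooling used in this paper.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def run_iterations(
    pre_test: Callable[[], None],
    operations: Dict[str, Callable[[], None]],
    post_test: Callable[[], None],
    iterations: int = 3,  # assumed value; the paper does not state the count
) -> Dict[str, float]:
    """Run the three phases repeatedly and return the mean time per operation."""
    samples: Dict[str, List[float]] = {name: [] for name in operations}
    for _ in range(iterations):
        pre_test()  # pre-test phase: prepare the share
        for name, operation in operations.items():  # test execution phase
            start = time.perf_counter()
            operation()
            samples[name].append(time.perf_counter() - start)
        post_test()  # post-test phase: clean up to free the limited space
    return {name: mean(values) for name, values in samples.items()}


if __name__ == "__main__":
    # Stand-in operations so the sketch runs without any cloud account.
    log: List[str] = []
    averages = run_iterations(
        pre_test=lambda: log.append("pre"),
        operations={"create share": lambda: time.sleep(0.01)},
        post_test=lambda: log.append("post"),
        iterations=2,
    )
    print(log, averages)
```

Passing the operations as callables keeps the loop independent of any particular provider, which matches the requirement that all providers are measured under the same conditions.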
Testing of a cloud storage provider should be performed while treating it as a black box. Normally, one should not be able to obtain any internal information about the cloud architecture or infrastructure, as this would be considered a security breach, and if that happened the cloud infrastructure could be classified as highly unreliable. For this reason, we use an opaque (black-box) testing technique, in which only the fundamental aspects of the system are explored. In that way, more data may be collected and accurate conclusions can be drawn about the behaviour and response of different cloud storage vendors under our setup.
In order to perform the test, we should ensure the following requirements, which are intended to provide the fairest possible test conditions:
1. A single platform or application should be used to access the different cloud storage providers.
2. Virtualization should be used, limited to a single virtual machine. This provides an isolated environment and is a safe, efficient, cheap and flexible way to test applications: one can test everything from server configurations to resource allocation and, most importantly for us, storage.
3. The operating system should be lightweight and handle resources well, so that it interferes as little as possible with the application and the test results are as accurate as they can be.
4. It should be considered that cloud storage has different characteristics for different uses (different end users or companies could make use of the service in different ways). For this reason, we focus only on file-system based operations and use a single application to access the different cloud storage solutions offered by vendors.
4. Building the testing environment
We use the Rclone² command-line tool as an intermediary application between a client and the cloud provider service, which provides the integration between them. Rclone is a tool written in the Go programming language that is used to upload and download data between a computer and cloud-hosted data storage. It can connect to various cloud storage services. This fulfils the requirement for a single platform with access to the different cloud storage services offered by vendors.
Another objective of using the Rclone command-line tool is to produce a multi-service cloud delivery model. By developing and implementing it, we can compare the supported storage services from a performance perspective. The architecture of the test environment is shown in Figure 2.
Figure 2: Architecture of a multi-cloud storage performance test
To provide virtualization, Oracle VirtualBox is used. It is a deceptively sim-
ple, but powerful and free to use cross-platform virtualization application for x86
hardware, targeted at server, desktop and embedded use [5].
² https://rclone.org
As the operating system, the CentOS Linux distribution was used, as it is a stable, predictable, manageable, and reproducible platform derived from the sources of Red Hat Enterprise Linux [6], [7]. It is available free of charge, and technical support is primarily provided by the community via official mailing lists, web forums, and chat rooms. Other reasons for choosing it are that it has good documentation, is highly customizable and is supported by VirtualBox.
As defined in the methodology description in Section 3, we have to implement the operations that are most used on storage. The list below shows each operation together with the specific Rclone command that was used to execute it:
• Create share
rclone mkdir [Provider]:Testdir
• List share
rclone lsf [Provider]:Test
• Move share
rclone move [Provider]:Test [Provider]:Testdir
• Copy share
rclone copy [Provider]:Testdir [Provider]:Test
• Delete share
rclone purge [Provider]:Testdir
• Upload file
rclone copy /home/user/documents/1GB.txt [Provider]:Test
• Download file
rclone sync [Provider]:Test /home/user/downloads
• Copy file
rclone copy [Provider]:Test/1GB.txt [Provider]:1GB.txt
• Delete file
rclone delete [Provider]:Test/1GB.txt
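One way to drive these commands for several providers is to fill in the [Provider] placeholder and time each invocation. The Python sketch below reproduces the commands exactly as listed above. The helper names build_commands and time_command and the remote name gdrive are hypothetical, since actual remote names depend on the local rclone configuration.

```python
import subprocess
import sys
import time
from typing import Dict, List


def build_commands(remote: str, local_file: str, download_dir: str) -> Dict[str, List[str]]:
    """Return the argument vectors for the nine operations listed above."""
    name = local_file.rsplit("/", 1)[-1]  # e.g. "1GB.txt"
    return {
        "create share": ["rclone", "mkdir", f"{remote}:Testdir"],
        "list share": ["rclone", "lsf", f"{remote}:Test"],
        "move share": ["rclone", "move", f"{remote}:Test", f"{remote}:Testdir"],
        "copy share": ["rclone", "copy", f"{remote}:Testdir", f"{remote}:Test"],
        "delete share": ["rclone", "purge", f"{remote}:Testdir"],
        "upload file": ["rclone", "copy", local_file, f"{remote}:Test"],
        "download file": ["rclone", "sync", f"{remote}:Test", download_dir],
        "copy file": ["rclone", "copy", f"{remote}:Test/{name}", f"{remote}:{name}"],
        "delete file": ["rclone", "delete", f"{remote}:Test/{name}"],
    }


def time_command(argv: List[str]) -> float:
    """Run one command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # "gdrive" is a hypothetical remote name; use whatever `rclone config` defines.
    commands = build_commands("gdrive", "/home/user/documents/1GB.txt", "/home/user/downloads")
    for operation, argv in commands.items():
        print(f"{operation}: {' '.join(argv)}")
    # Timing demo with a harmless command, so no rclone installation is needed here.
    print(f"{time_command([sys.executable, '-c', 'pass']):.3f} s")
```

Measuring around the whole process call includes rclone start-up time in every sample, which is acceptable here because the same overhead applies to every provider.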
5. Experiment results and analysis
This section presents the results of the performance comparison of cloud service providers. After presenting the individual and average execution times of each command listed in the previous section, we also analyse the different providers based on their pricing plans. To perform the test, a single 1GB file with randomly generated contents is used.
All times shown in the tables and figures in this section, as well as in the appendix, are in the format minutes:seconds (mm:ss).
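A test file with random contents can be produced in many ways, and the paper does not state which one was used. The following Python sketch shows one possibility. The function name write_random_file and the chunk size are assumptions made for illustration only.

```python
import os
import tempfile


def write_random_file(path: str, size_bytes: int, chunk_size: int = 1024 * 1024) -> None:
    """Write size_bytes of random data to path, one chunk at a time."""
    remaining = size_bytes
    with open(path, "wb") as handle:
        while remaining > 0:
            chunk = os.urandom(min(chunk_size, remaining))
            handle.write(chunk)
            remaining -= len(chunk)


if __name__ == "__main__":
    # A small file for demonstration; a 1 GB file would need about 10**9 bytes
    # (or 2**30, since the exact definition of "1GB" is not stated in the paper).
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "sample.bin")
        write_random_file(target, 5 * 1024 * 1024 + 123)
        print(os.path.getsize(target))
```

Writing in chunks keeps memory use bounded regardless of the file size, and random contents prevent transparent compression from distorting the transfer times.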
The experiment described in this section was undertaken under two important assumptions:
• Only the free services delivered by cloud storage providers are tested. This is an important assumption, because a given cloud service provider may limit the resources available to its free-tier services, while increasing or removing that limit for the paid plans.
• The analysis of pricing plans covers only the monthly plans of each provider, and only those for personal users. This is important because many providers offer additional services on top of storage, which may influence the price of storage. Cloud providers also offer additional subscriptions, such as annual, family, business, and enterprise plans, which may vary significantly in terms of pricing.
5.1. Test Results
The results of the tests performed with the methodology and environment described in Sections 3 and 4 are shown in Table 1.
Table 1
Performance results of cloud storage providers: average times (mm:ss)
Operation               Google Drive   OneDrive   Dropbox
Create share            00:01.7        00:02.4    00:01.3
List share              00:00.8        00:01.4    00:00.9
Move share              00:01.3        00:02.5    00:02.0
Copy share              00:01.4        00:02.8    00:01.4
Delete share            00:01.2        00:01.8    00:01.4
Upload                  03:57.4        04:36.2    04:32.7
Download                05:41.3        06:06.8    02:30.1
Copy                    00:04.7        00:06.7    00:02.9
Delete                  00:02.2        00:03.1    00:01.7
Duration of all tests   09:52.0        11:03.8    07:14.3
As seen from Table 1, the performance of the three compared cloud storage providers is similar for share/directory operations, with slight underperformance of OneDrive (Figure 3). Upload times are also similar for all three providers, whereas the average download time in Table 1 is noticeably shorter for Dropbox (Figure 4). A more detailed table with the test results is presented in Appendix 1.
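The per-operation averages can be checked against the reported total duration by converting the mm:ss values to seconds and summing them. The Python sketch below does this for the values in Table 1. The helper names parse_mmss and format_mmss are hypothetical and are not part of the tooling used in the experiment.

```python
def parse_mmss(text: str) -> float:
    """Convert a 'mm:ss.s' string such as '03:57.4' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def format_mmss(seconds: float) -> str:
    """Convert seconds back to the 'mm:ss.s' form used in Table 1."""
    tenths = round(seconds * 10)
    minutes, rest = divmod(tenths, 600)
    return f"{minutes:02d}:{rest // 10:02d}.{rest % 10}"


# Average times copied from Table 1, in row order (Create share ... Delete).
TABLE_1 = {
    "Google Drive": ["00:01.7", "00:00.8", "00:01.3", "00:01.4", "00:01.2",
                     "03:57.4", "05:41.3", "00:04.7", "00:02.2"],
    "OneDrive": ["00:02.4", "00:01.4", "00:02.5", "00:02.8", "00:01.8",
                 "04:36.2", "06:06.8", "00:06.7", "00:03.1"],
    "Dropbox": ["00:01.3", "00:00.9", "00:02.0", "00:01.4", "00:01.4",
                "04:32.7", "02:30.1", "00:02.9", "00:01.7"],
}

if __name__ == "__main__":
    for provider, times in TABLE_1.items():
        total = sum(parse_mmss(value) for value in times)
        print(provider, format_mmss(total))
```

The sums come to 09:52.0, 11:03.7 and 07:14.4. The first matches the reported total exactly, and the other two differ from Table 1 by 0.1 s, presumably because the published averages are themselves rounded.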
Figure 3: Average time (mm:ss) of share and directory operations
Figure 4: Average time (mm:ss) of upload and download file operations (file size is 1 GB)
5.2. Pricing plans evaluation
All cloud storage providers have consumer storage plans and also offer separate storage plans for business. Here we focus on consumer storage plans. Please note that all prices refer to individual accounts and are not options for businesses. Also, depending on the plan, every provider offers bonus features that are not part of our research.
This research shows that the pricing plans of the tested cloud storage providers are almost the same. However, it should be noted that Google Drive offers the largest storage space in its free plan. It is also one of the most generous cloud storage providers with its plans, even though the free storage is shared between the different services that Google offers.
Table 2
Cloud storage providers pricing comparison
Google Drive                 OneDrive                     Dropbox
Storage   Price per month    Storage   Price per month    Storage   Price per month
15 GB     Free               5 GB      Free               2 GB      Free
30 GB     6$                 100 GB    1.99$              2 TB      9.99$ starting
2 TB      12$                1 TB      6.99$              3 TB      16.58$ starting
5 TB      18$                6 TB      9.99$
At first glance, Dropbox probably has the best pricing offers for bigger storage needs and offers the best price-per-space ratio. However, it should also be noted that most providers offer users a large number of other services together with the storage. A fair pricing comparison of cloud storage providers therefore requires a more complex methodology and criteria.
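A price-per-space ratio can be computed directly from the paid tiers in Table 2. The Python sketch below is a rough illustration only. It takes the plan values as read from the table, assumes 1 TB = 1000 GB, and ignores both the "starting" qualifier on the Dropbox prices and any bundled services. The names PAID_PLANS and price_per_tb are hypothetical.

```python
# (storage in TB, monthly price in USD) for the paid tiers as read from Table 2.
# 1 TB is taken as 1000 GB, which is an assumption of this sketch.
PAID_PLANS = {
    "Google Drive": [(0.03, 6.00), (2.0, 12.00), (5.0, 18.00)],
    "OneDrive": [(0.1, 1.99), (1.0, 6.99), (6.0, 9.99)],
    "Dropbox": [(2.0, 9.99), (3.0, 16.58)],
}


def price_per_tb(storage_tb: float, monthly_price: float) -> float:
    """Monthly price divided by storage, in USD per TB."""
    return monthly_price / storage_tb


if __name__ == "__main__":
    for provider, plans in PAID_PLANS.items():
        for storage_tb, price in plans:
            ratio = price_per_tb(storage_tb, price)
            print(f"{provider}: {storage_tb} TB at {price:.2f} USD -> {ratio:.2f} USD per TB")
```

The ratio varies strongly with the tier that is compared, so such figures are indicative at best and do not replace the more complex comparison methodology called for above.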
6. Conclusion
In terms of data-intensive systems, it is worthwhile to be able to evaluate the different storage options available to small businesses and individual users. In contemporary systems, most data is stored in the cloud using the services of different cloud storage providers. This paper presents a methodological framework for evaluating cloud storage providers in terms of their performance parameters. It also presents details of a specific testing environment and the results of testing the performance of three popular cloud providers that also offer free storage options. Additionally, a comparison of the pricing plans of these providers is performed; however, it is difficult to assess them in this respect, as most subscriptions include other services besides storage.
It should be noted that a certain drawback of cloud solutions is bandwidth limitation: the end-user network is a very important part of the cloud service. If the network is slow and unstable, users may have trouble accessing or sharing files, and it may even become impossible to work in this kind of environment. However, investigating how the end-user network affects the performance of cloud storage providers is part of our further research.
Directions for future research include:
• Extending the comparison to more service providers.
• Development of a methodology for comparing other quality characteristics of cloud storage providers, such as reliability, availability, security and cost-effectiveness. It may also be beneficial to define a compound measure of cloud storage quality of service by combining the results of the various tests of such characteristics.
• Expanding the experiment to include more diverse tests, for example with various file sizes or a single transaction with a large number of files (both small and large ones).
7. Acknowledgements
Research presented in this paper is partially supported by the Sofia University “St. Kliment Ohridski” Research Science Fund project No. 80-10-145/23.05.2022 – “Data intensive software architectures”.
The authors are also grateful to the anonymous reviewers for their valuable comments and remarks, which helped to increase the quality of the paper.
8. References
[1] Kleppmann, M. (2017). Designing Data-Intensive Applications. O’Reilly. Beijing.
[2] Hey, T., Tansley, S., Tolle, K. M. (2009). Jim Gray on eScience: a transformed scientific method. The Fourth Paradigm.
[3] Best cloud storage providers 2021, available at: https://cloudstorageinfo.org/top-10-cloud-storage-providers.
[4] Cloud Storage Comparison – Table Chart, available at: https://www.goodcloudstorage.net/cloud-storage-comparison.
[5] Arif, H., Hajjdiab, H., Harbi, F. A., Ghazal, M. (2019). A Comparison between Google Cloud Service and iCloud. 2019 IEEE 4th International Conference on Computer and Communication Systems (ICCCS), pp. 337-340, doi: 10.1109/CCOMS.2019.8821744.
[6] Roy, M., Singh, M. (2022). Performance Evaluation and Comparison of Various Personal Cloud Storage Services for Healthcare Images. In: Tavares, J.M.R.S., Dutta, P., Dutta, S., Samanta, D. (eds) Cyber Intelligence and Information Retrieval. Lecture Notes in Networks and Systems, vol 291. Springer, Singapore.
[7] Zenuni, X., Ajdari, J., Ismaili, F., Raufi, B. (2014). Cloud storage providers: A comparison review and evaluation. 883, pp. 272-277, doi: 10.1145/2659532.2659609.
[8] Malla, S., Christensen, K. (2020). HPC in the cloud: Performance comparison of function as a service (FaaS) vs infrastructure as a service (IaaS). Internet Technology Letters, 3(1), e137.
Appendix: Detailed results of performance comparison