Automated Benchmarking of Cloud-Hosted DBMS With benchANT

Daniel Seybold, Jörg Domaschka
Ulm University, Institute of Information Resource Management, Albert-Einstein-Allee 43, 89077 Ulm, Germany
daniel.seybold@uni-ulm.de (D. Seybold); joerg.domaschka@uni-ulm.de (J. Domaschka)
https://www.uni-ulm.de/in/omi/institut/persons/daniel-seybold/ (D. Seybold); https://www.uni-ulm.de/in/omi/institut/persons/jd/ (J. Domaschka)

SSP’21: Symposium on Software Performance, November 09–10, 2021, Leipzig, Germany

Keywords: DBMS, Cloud, Performance, Scalability, Benchmarking-as-a-Service

Driven by the data-intensive applications of Web 2.0, Big Data and the Internet of Things, Database Management Systems (DBMSs) and their operation have changed significantly over the last decade. Besides relational DBMSs, manifold NoSQL [1, 2] and NewSQL [3, 2] DBMSs have evolved, promising a set of non-functional features that are key requirements for every data-intensive application: high performance, horizontal scalability, elasticity and high availability [4]. In order to take full advantage of these non-functional features, the operation of DBMSs is moving towards elastic infrastructures such as the cloud. Cloud computing enables scalability and elasticity on the resource level. Therefore, the storage backend of data-intensive applications is commonly implemented by distributed DBMSs operated on cloud resources [5]. Yet, the sheer number of heterogeneous DBMSs, cloud resource offers, and the resulting number of combinations makes the selection and operation of DBMSs a very challenging task [6, 7]. Therefore, supporting evaluations of the non-functional DBMS features are essential. However, the design and execution of such evaluations is a complex process that requires detailed knowledge of multiple domains [8, 9, 10]. First, the multitude of DBMS technologies with their respective runtime parameters needs to be considered. Secondly, the tremendous number of resource offers, including their volatile characteristics, needs to be taken into account. Thirdly, the application-specific workload has to be created by suitable DBMS benchmarks. While established DBMS benchmarks focus only on DBMS performance, designing and executing evaluations of advanced non-functional features such as scalability, elasticity and availability is even more challenging [11].

In order to address these challenges, we present the novel Benchmarking-as-a-Service (BaaS) platform benchANT (https://benchant.com/), which fully automates the benchmarking process of cloud-hosted DBMSs. benchANT is a spin-off of Ulm University and consequently builds on our latest research results in cloud and DBMS performance engineering [12]. In particular, our research results define a supportive evaluation methodology consisting of: (i) domain-specific impact factors for designing comprehensive DBMS evaluations; (ii) a set of evaluation principles to ensure significant results. Moreover, our methodology emphasizes reproducible evaluation processes for the non-functional features performance, scalability, elasticity and availability. On a technical level, the benchANT platform builds upon our research prototypes Mowgli [10], Kaa [13] and King Louie [14].

Mowgli provides a novel DBMS evaluation framework, supporting the design and automated execution of performance and scalability evaluation processes. Mowgli manages cloud resources, DBMS deployment, workload execution and result processing based on evaluation scenarios, which expose configurable domain-specific parameters. The Kaa framework [13] automates the DBMS elasticity evaluation process by enabling DBMS and workload adaptations. The King Louie framework [14] builds upon these features and enables availability evaluations by providing an extensive failure injection framework.

benchANT lifts these research results into an enterprise-grade BaaS platform with an easy-to-use benchmark configurator. In its current state, benchANT enables the configuration and automated execution of 7 major RDBMS, NoSQL and NewSQL DBMSs (MySQL, PostgreSQL, ArangoDB, Apache Cassandra, Couchbase, MongoDB and CockroachDB) with DBMS-specific runtime configurations such as cluster size, replication factor or consistency settings; 4 public cloud providers (AWS, Azure, IONOS and Telekom) with over 700 different VM flavours; and 1 benchmark (the Yahoo Cloud Serving Benchmark, YCSB) with 5 workload configurations.
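To illustrate the kind of domain-specific parameters such an evaluation scenario exposes, the following minimal sketch shows a hypothetical scenario definition in Python notation. The field names and values are illustrative assumptions and do not reflect the actual benchANT or Mowgli configuration format.

    # Hypothetical evaluation scenario sketch; structure and field names are
    # illustrative assumptions, not the actual benchANT/Mowgli format.
    scenario = {
        "dbms": {
            "type": "MongoDB",            # one of the 7 supported DBMSs
            "cluster_size": 3,
            "replication_factor": 3,
            "write_concern": "majority",  # example consistency setting
        },
        "cloud": {
            "provider": "AWS",            # one of the 4 supported providers
            "vm_flavour": "m5.xlarge",    # one of over 700 VM flavours
        },
        "workload": {
            "benchmark": "YCSB",
            "workload": "workloada",      # one of 5 workload configurations
            "record_count": 10_000_000,
        },
    }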
The evaluation results are automatically processed by the benchANT platform and presented in a result dashboard. The result processing uses the raw DBMS benchmark metrics, throughput and latency, to generate higher-level metrics such as the scalability factor [10] or unified metrics over the dimensions performance, costs and availability [15].
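As an illustration of such a higher-level metric, the following sketch computes a throughput-based scalability factor by relating the measured throughput gain to the ideal linear gain when scaling out the cluster. This is one plausible formulation for illustration only, not necessarily the exact metric definition used in [10].

    # Minimal sketch of a throughput-based scalability factor: measured speedup
    # divided by ideal linear speedup. One plausible formulation, not
    # necessarily the exact metric definition from [10].
    def scalability_factor(throughput_ops_s: dict, baseline_nodes: int, scaled_nodes: int) -> float:
        measured_speedup = throughput_ops_s[scaled_nodes] / throughput_ops_s[baseline_nodes]
        ideal_speedup = scaled_nodes / baseline_nodes
        return measured_speedup / ideal_speedup  # 1.0 = perfectly linear scaling

    # Example: scaling from 3 to 9 nodes with sub-linear throughput growth.
    print(f"{scalability_factor({3: 30_000, 9: 72_000}, baseline_nodes=3, scaled_nodes=9):.2f}")  # 0.80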
In this talk, we provide an overview of how these research results are incorporated into the novel BaaS concept and demonstrate in a live walk-through how benchANT supports practitioners and researchers in addressing performance challenges such as:

• Which cloud provider and which VM flavour provides the best performance/cost ratio for a 3-node MongoDB cluster?
• Will new DBMS releases always increase performance?
• Is there a significant throughput and latency difference between MongoDB, Cassandra and Couchbase for an IoT workload?

References

[1] A. Davoudian, L. Chen, M. Liu, A survey on NoSQL stores, ACM Comput. Surv. 51 (2018). doi:10.1145/3158661.
[2] S. Mazumdar, D. Seybold, K. Kritikos, Y. Verginadis, A survey on data storage and placement methodologies for cloud-big data ecosystem, Journal of Big Data 6 (2019) 15. doi:10.1186/s40537-019-0178-3.
[3] K. Grolinger, W. A. Higashino, A. Tiwari, M. A. Capretz, Data management in cloud environments: NoSQL and NewSQL data stores, Journal of Cloud Computing: Advances, Systems and Applications 2 (2013) 22. doi:10.1186/2192-113X-2-22.
[4] D. Abadi, R. Agrawal, A. Ailamaki, M. Balazinska, P. A. Bernstein, M. J. Carey, S. Chaudhuri, J. Dean, A. Doan, M. J. Franklin, J. Gehrke, L. M. Haas, A. Y. Halevy, J. M. Hellerstein, Y. E. Ioannidis, H. V. Jagadish, D. Kossmann, S. Madden, S. Mehrotra, T. Milo, J. F. Naughton, R. Ramakrishnan, V. Markl, C. Olston, B. C. Ooi, C. Ré, D. Suciu, M. Stonebraker, T. Walter, J. Widom, The Beckman report on database research, Commun. ACM 59 (2016) 92–99. doi:10.1145/2845915.
[5] D. Abadi, A. Ailamaki, D. Andersen, P. Bailis, M. Balazinska, P. Bernstein, P. Boncz, S. Chaudhuri, A. Cheung, A. Doan, et al., The Seattle report on database research, SIGMOD Rec. 48 (2020) 44–53. doi:10.1145/3385658.3385668.
[6] S. Sakr, Cloud-hosted databases: technologies, challenges and opportunities, Cluster Computing 17 (2014) 487–502. doi:10.1007/s10586-013-0290-7.
[7] M. Stonebraker, A. Pavlo, R. Taft, M. L. Brodie, Enterprise database applications and the cloud: A difficult road ahead, in: 2014 IEEE International Conference on Cloud Engineering, IEEE, 2014, pp. 1–6. doi:10.1109/IC2E.2014.97.
[8] D. Seybold, Towards a framework for orchestrated distributed database evaluation in the cloud, in: Proceedings of the 18th Doctoral Symposium of the 18th International Middleware Conference, Middleware ’17, ACM, New York, NY, USA, 2017, pp. 13–14. doi:10.1145/3152688.3152693.
[9] J. Domaschka, D. Seybold, Towards understanding the performance of distributed database management systems in volatile environments, in: Symposium on Software Performance, volume 39, Gesellschaft für Informatik, 2019, pp. 11–13. URL: https://pi.informatik.uni-siegen.de/stt/39_4/01_Fachgruppenberichte/SSP2019/SSP2019_Domaschka.pdf.
[10] D. Seybold, M. Keppler, D. Gründler, J. Domaschka, Mowgli: Finding your way in the DBMS jungle, in: Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, ICPE ’19, ACM, New York, NY, USA, 2019, pp. 321–332. doi:10.1145/3297663.3310303.
[11] D. Seybold, J. Domaschka, Is distributed database evaluation cloud-ready?, in: European Conference on Advances in Databases and Information Systems (ADBIS) - New Trends in Databases and Information Systems (Short Papers), Springer International Publishing, Cham, 2017, pp. 100–108. doi:10.1007/978-3-319-67162-8_12.
[12] D. Seybold, An automation-based approach for reproducible evaluations of distributed DBMS on elastic infrastructures, Ph.D. thesis, 2021. URL: https://oparu.uni-ulm.de/xmlui/handle/123456789/37430. doi:10.18725/OPARU-37368.
[13] D. Seybold, S. Volpert, S. Wesner, A. Bauer, N. Herbst, J. Domaschka, Kaa: Evaluating elasticity of cloud-hosted DBMS, in: 2019 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), 2019, pp. 54–61. doi:10.1109/CloudCom.2019.00020.
[14] D. Seybold, S. Wesner, J. Domaschka, King Louie: Reproducible availability benchmarking of cloud-hosted DBMS, in: 35th ACM/SIGAPP Symposium on Applied Computing (SAC ’20), March 30-April 3, 2020, Brno, Czech Republic, 2020, pp. 144–153. doi:10.1145/3341105.3373968.
[15] J. Domaschka, S. Volpert, D. Seybold, Hathi: An MCDM-based approach to capacity planning for cloud-hosted DBMS, in: 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC), 2020, pp. 143–154. doi:10.1109/UCC48980.2020.00033.