<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Workshop on Cloud Technologies in Education</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Cloud enabling educational platforms with corc</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rasmus Munk</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Marchant</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Brian Vinter</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Aarhus University</institution>
          ,
          <addr-line>Ny Munkegade 120, Aarhus C, 8000</addr-line>
          ,
          <country country="DK">Denmark</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Niels Bohr Institute</institution>
          ,
          <addr-line>Blegdamsvej 17, Copenhagen, 2100</addr-line>
          ,
          <country country="DK">Denmark</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>18</volume>
      <issue>2020</issue>
      <fpage>04</fpage>
      <lpage>4</lpage>
      <abstract>
        <p>In this paper, it is shown how teaching platforms at educational institutions can utilize cloud platforms to scale a particular service, or gain access to compute instances with accelerator capability such as GPUs. Specifically, at the University of Copenhagen (UCPH), it is demonstrated how the internal JupyterHub service, named Data Analysis Gateway (DAG), could utilize compute resources in the Oracle Cloud Infrastructure (OCI). This is achieved by utilizing the introduced Cloud Orchestrator (corc) framework in conjunction with the novel JupyterHub spawner named MultipleSpawner. Through this combination, we are able to dynamically orchestrate, authenticate, configure, and access interactive Jupyter Notebooks in the OCI with user-defined hardware capabilities. These capabilities include settings such as the minimum amount of CPU cores, memory, and GPUs the particular orchestrated resources must have. This enables teachers and students at educational institutions such as UCPH to gain easy access to the required capabilities for a particular course. In addition, we lay out how this groundwork will enable us to establish a Grid of Clouds between multiple trusted institutions. This enables the exchange of surplus computational resources that could be employed across their organisational boundaries.</p>
      </abstract>
      <kwd-group>
        <kwd>teaching</kwd>
        <kwd>cloud computing</kwd>
        <kwd>grid of clouds</kwd>
        <kwd>Jupyter Notebook</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The availability of required computational resources in organisations, such as scientific or
educational institutions, is a crucial aspect of delivering the best scientific research and teaching.
When teaching courses involving data analysis techniques it can be beneficial to have access to
specialized platforms, such as GPU accelerated architectures.</p>
      <p>At higher educational institutions, such as the University of Copenhagen (UCPH) or Lund
University (LU), the compute centers that provide such platforms are substantial investments that
are continuously maintained and upgraded. However, the usage of these resources often varies
wildly between being fully utilized and sitting idle.</p>
      <p>We therefore propose that these institutional resources be made available (with varying
priority) across trusted educational and scientific organisations. Foremost, this is to enable the
voluntary sharing of underused resources with other institutions, thereby potentially establishing
greater scalability than can be found within each individual institution.</p>
      <sec id="sec-1-1">
        <title>1.1. Basic IT</title>
        <p>Within institutions such as UCPH, there is a mixture of services that each must provide. At the
very basic level, there are infrastructure services such as networking, account management,
email, video conferencing, payroll management, license management, as well as OS and software
provisioning. In this paper, we define these as Basic IT services. At educational institutions,
additional services can be added to this list; these include services for handling student
enrollment, submissions, grading, course management, and forum discussions. As with the initial
Basic IT services, these are typically off-the-shelf products that need to be procured, installed,
configured and maintained on a continuous basis.</p>
        <p>A distinguishing trait of Basic IT services, in an education context, is that they are very
predictable in terms of the load they will exhibit, both in times of high and low demand. For
instance, there will be busy junctions, such as assignment hand-in days, release of grades, student
enrollment, and so on. In contrast, holiday and inter-semester periods will likely experience
minor to no usage. Given this, these services are classic examples of what cloud computing was
developed to provide: efficient utilization of on-demand resources, with high availability and
scalability to handle fluctuating usage in a cost-effective manner.</p>
      </sec>
      <sec id="sec-1-2">
        <title>1.2. Science IT</title>
        <p>Science IT services, in contrast, revolve around the institution’s scientific activities, whether
by researchers or students. They include services such as management, sharing, transferring,
archiving, publishing, and processing of data, in order to facilitate the scientific process. In
addition, these facilities also enable lecturers to utilize their research material in courses, giving
students access to the same platform and resources.</p>
        <p>
          What distinguishes these services is that they impose different constraints compared to Basic
IT services. These typically involve areas such as computational load, security, budgetary,
scientific, and legal requirements, among others. For example, it is often too inefficient or costly
to utilize public cloud resources for the storing and processing of large scientific datasets at the
petabyte scale. In this case, a more traditional approach such as institutional compute resources
is required [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>
          Research fields such as climate science [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], oceanography [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], and astronomy [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], often employ
experimental simulations as a common scientific tool. These simulations produce output up to
petabytes in size, which still needs to be stored for subsequent postprocessing and analysis. Upon a
scientific discovery from this process, the resulting datasets need to be archived in accordance
with regulatory requirements, which in the case of UCPH is 5 years [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] (only available in
Danish).
        </p>
      </sec>
      <sec id="sec-1-3">
        <title>1.3. Institutional resources</title>
        <p>
          High Performance Computing (HPC) and regular compute centers are often established at
higher educational institutions to provide Science IT services. The UCPH [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], University of
Antwerp [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], and LU [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] compute centers are examples of this. In addition, institutions can also
gain access to similar resources through joint facilities like the Vienna Scientific Cluster [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ],
which supports 19 institutions, 10 of which are higher educational institutions. Finally, there
are national and pan-national resources such as ARCHER2 (UK) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] or the EuroHPC [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] that
review applications before access is granted.
        </p>
      <p>These established centers are very expensive to build and have a limited lifespan before they
need to be replaced. Even smaller educational compute platforms follow a similar life-cycle. For
instance, at UCPH a typical machine has a lifetime of 5 years before it needs to be replaced,
whether the machine has been heavily utilized or not. Therefore, it is important that
these systems across institutions are utilized, not only efficiently, but at maximum capacity
throughout their lifetime.</p>
        <p>
          For organising the sharing of resources across trusted educational and scientific organisations,
inspiration is drawn from the way traditional computational Grids have been established [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
The difference is that instead of establishing a Grid where individual resources are attached,
this model is based on each institution establishing a Cloud of resources that are
shared via a Grid. This means that the Grid is responsible for interconnecting disjointed clouds,
whether they be institutional or public cloud platforms. The result is an established model
for sharing cloud resources across educational institutions in support of cloud services for
bachelor and master courses, general workshops, seminars and scientific research.
        </p>
        <p>
          In this paper, we present how an existing teaching and research service at UCPH could be
enabled with access to a cloud framework, which is the first step towards a Grid of Clouds
resources. We accomplish this by using the Cloud Orchestrator (corc) framework [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Through
this, we are able to empower the DAG service with compute resources, across every course at
UCPH, that were previously inaccessible and not feasible to provide with internal resources
alone. Since we do not have access to other institutional resources at this point in time, we
utilized a public cloud provider to scale the service with external resources.
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>
        At the Niels Bohr Institute (NBI), part of UCPH, we host a number of Science IT services that
are part of providing a holistic educational platform for researchers, teachers, students, and
general staff. A subset of these Science IT services has been especially beneficial across all
levels of teaching. Namely, the University Learning Management System (LMS),
called Absalon, which is based on Canvas [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and is used for submissions and grading; the Electronic
Research Data Archive (ERDA) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] for data management and sharing tasks; and the
Data Analysis Gateway (DAG) [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], which is a JupyterHub powered platform for interactive
programming and data processing in preconfigured environments.
      </p>
      <sec id="sec-2-1">
        <title>2.1. Teaching platforms</title>
        <p>The combination of this subset of services, in particular the combination of ERDA and DAG,
has been especially successful. Teachers have used these to distribute course material through
ERDA, which made the materials available for students to work on at the outset of the course.
This ensures that students can get on with the actual learning outcomes from the get-go, and
not spend time on tedious tasks such as installing prerequisite software for a particular course.
Due to budgetary limitations, we have only been able to host the DAG service with standard
servers that don’t give access to any accelerated architectures.</p>
        <p>
          Across educational institutions, courses in general have varying requirements in terms of
computing resources, environments, and data management, as defined by the learning outcomes
of the course. The requirements from computer science, data analysis, and physics oriented
courses are many, and often involve specialized compute platforms. For example, novel data
analysis techniques, such as Machine Learning or Deep Learning, have been employed across a
wide range of scientific fields. What is distinct about these techniques is the importance of the
underlying compute platform on which they are executed. Parallel architectures such as GPUs
are particularly beneficial in this regard, specifically since the number of independent linear
systems that typically need to be solved to give adequate and reliable answers is immense.
The inherent independence of these calculations makes them suitable for being performed in
parallel, making it hugely beneficial to utilize GPUs [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
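        <p>As a small illustration of this independence, the following sketch (assuming a Python environment with TensorFlow installed and, optionally, a GPU; the sizes are arbitrary) solves a batch of independent linear systems in parallel:</p>
        <p>import tensorflow as tf

# Which accelerators are visible in the current environment?
print(tf.config.list_physical_devices("GPU"))

# A batch of 1024 independent 64x64 linear systems, solved in parallel.
# TensorFlow places the batched solve on the GPU when one is available.
A = tf.random.normal((1024, 64, 64))
b = tf.random.normal((1024, 64, 1))
x = tf.linalg.solve(A, b)  # each system in the batch is independent
print(x.shape)  # (1024, 64, 1)</p>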
        <p>
          Given that the DAG service was an established service at UCPH for data analysis and
programming in teaching bachelor and master students, it seemed the ideal candidate to
enable with access to cloud resources with accelerator technology. For instance, courses such
as Introduction to Computing for Physicists (abbreviated to DATF in Danish) [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], Applied
Statistics: From Data to Results (APPSTAT) [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], and High Performance Parallel Computing
(HPPC) [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], all would benefit from having access to GPU accelerators to solve several of the
practical exercises and hand-in assignments.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. ERDA</title>
        <p>ERDA provides a web based data management platform across UCPH with a primary focus on
the Faculty of Science. Its primary role is to be a data repository for all employees and students
across UCPH. Through a simple web UI powered by a combination of an Apache webserver and
a Python based backend, users are able to either interact with the different services through
its navigation menu, or with a user’s individual files and folders via its file manager. An example
of the interface can be seen in figure 1. The platform itself is a UCPH-specific version of the
open source Minimum Intrusion Grid (MiG) [<xref ref-type="bibr" rid="ref21">21</xref>], which provides multiple data management
functionalities. These functionalities include easy and secure upload of datasets, simple access
mechanisms through a web file manager, and the ability to establish collaboration and data
sharing between users through Workgroups.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.3. Jupyter</title>
        <p>
          Project Jupyter [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] develops a variety of open source tools. These tools aim at supporting
interactive data science and scientific computing in general. The foundation of these is the
IPython Notebook (.ipynb) format (evolved out of the IPython Project [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]). This format is
based on interpreting special segments of a JSON document as source code, which can be
executed by a custom programming language runtime environment, also known as a kernel.
The JupyterLab [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ] interface (as shown in figure 2) is the standard web interface for interacting
with the underlying notebooks. JupyterHub [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] is the de-facto standard to enable multiple
users to utilize the same compute resources for individual Jupyter Notebook/Lab sessions. It
does this through its own web interface gateway and backend database, to segment and register
individual users before allowing them to start a Jupyter session.
        </p>
        <p>
          In addition, JupyterHub allows for the extension of both custom Spawners and Authenticators,
enabling 3rd party implementations. The Authenticator is in charge of validating that a particular
request is from an authentic user. The responsibility of the Spawner is how a Jupyter session
is to be scheduled on a resource. Currently there exist only static Spawners that utilize either
preconfigured resources that have been deployed via Batch or Container Spawners, or at
selective cloud providers such as AWS [<xref ref-type="bibr" rid="ref26">26</xref>]. As an
exception to this, the WrapSpawner [<xref ref-type="bibr" rid="ref27">27</xref>] allows one of
a set of preconfigured Spawner profiles to be selected when a session is launched. However,
these profiles cannot be changed after the JupyterHub service is launched, making it impossible
to dynamically change the set of supported resources and providers. Therefore, it would be of
benefit if a Spawner extended the WrapSpawner’s existing capabilities with the ability to
dynamically add or remove providers and resources.
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Interactive</title>
        <p>before termination. In contrast, internal hosted services such as DAG allow for the institution to define
thiRsepsoelaicryc.hAint UclCouPHd,cowme phuatviengdeffinoerdedthuicsattoiobnet2yphiocuarllsyorfeivnoalcvteivsiatyr,oaunndd
aunsiunngliWmeitbe-denamaboluedntSoofftactwivaerteimaseafoSrearnviicnedi(vSiadauSa)laspespsliocnat.iHonosw. eEvxearm,apslTeasbolef s2uschhowinsc,lwudeecuprlaretfnotrlymdsosnu’cthpraosvGidiet HanuybG[2P9U],
capGaoboilgitlye, Dwohcicsh[ 3is0s],oGmoeothgilnegCtohlaatbcoorualtdorbye [c3h1a]n,gKeadgtghlreo[u3g2h],tahnedutBiliinsdateiron[3o3f].anEaecxhteornfathlcelsoeucdawn ifiltlh
GPaUppaortwiceurleadr cnoimchpeuitne arecsoouurrsceesa.t the teacher’s or student’s discretion. Nevertheless, the provided</p>
        <p>
          Gcaivpeanbitlhitiys,otfhteenDdAoGessecrovmicee wseiethmeitds aoswtnheb uidredaelncsa,ninditdhaatet tthoeeamdpmowineisrtwraittihonexotfertnhael scelorvuidcerei-s
souorfcteesn. lBeoftthtobethcaeutseeacithpinrogvtiedaemssirmesiplaornfseiabtluerefosratshtehecopuurbsleic. cTlhouisdrpersopvoindseirbsiliintytetrympsicoafllLyainngculuagdeess
andesCtaoblllaisbhoirnagtesatubdileitnyt, bacucteaslss,ocsoiunrcseeitmisatienrtieaglrdaitsetdridbiurteicotnlytwoitthheUsCpePcHifics pdlaattafomrman,aggueimdeesntonsehrvoiwce.
Provider
Binder[
          <xref ref-type="bibr" rid="ref37">37</xref>
          ]
        </p>
        <p>None
to get started with the service and solving eventual problems related to the service throughout
the course. In addition, many of the external cloud services that ofer free usage, often have
certain limitations, such as how much instance utilisation a given user can consume in a given
time span. Instead, providing such functionalities as Science IT services, could reduce these
overheads and enable seamless integration into the courses. Furthermore, existing resources
could be used to serve the service by scaling through an established Grid of Clouds.</p>
        <p>
          In terms of existing public cloud platforms that can provide Jupyter Notebook experiences,
DAG is similar to Google Colaboratory, Binder, Kaggle, Azure Notebooks [<xref ref-type="bibr" rid="ref34">34</xref>],
CoCalc [<xref ref-type="bibr" rid="ref35">35</xref>], and
Datalore [<xref ref-type="bibr" rid="ref36">36</xref>]. All of these online options have the following in common. They all have free
tier plans available with certain hardware and usage limitations. All run entirely in the
web browser and don’t require anything to be installed locally. At most they require a valid
account to get started. Each of them presents a Jupyter Notebook or Notebook-like interface,
which allows for both export and import of Notebooks in the standard format. An overview
of a subset of the supported features and usage limits across these platforms can be seen in
Table 1, and their hardware capabilities in Table 2. From looking at the features, each provider
is fairly similar in terms of enabling Languages, Collaborating, and Native Persistence (i.e. the
ability to keep data after the session has ended). However, there is a noticeable difference in the
maximum time (MaxTime) that each provider allows a given session to be inactive before it is
stopped, with CoCalc being the most generous, allowing 24 hours of activity before termination.
In contrast, internally hosted services such as DAG allow for the institution to define this policy.
At UCPH, we have defined this to be 2 hours of inactivity, and an unlimited amount of active
time for an individual session. However, as Table 2 shows, we currently don’t provide any GPU
capability, which is something that could be changed through the utilisation of an external
cloud with GPU powered compute resources.
        </p>
        <p>Given this, the DAG service seemed the ideal candidate to empower with external cloud
resources, both because it provides similar features as the public cloud providers in terms of
Languages and Collaboration ability, and because it is integrated directly with UCPH’s data
management service.</p>
      </sec>
      <sec id="sec-2-4">
        <title>3.2. Cloud Orchestration</title>
        <p>
          Cloud orchestration is the automated deployment, configuration, and coordination of
computer systems [<xref ref-type="bibr" rid="ref45">45</xref>]. Through orchestration, an organisation or individual
is able to establish a complex infrastructure through a well defined workflow. For instance,
the successful creation of a compute node involves the processing of a series of complex
tasks that all must succeed. An example of such a workflow can be seen in figure 4. Here
a valid Image, Shape, Location and Network has to be discovered, selected, and successfully
utilized together in order for the cloud compute node to be established. An Image is the target
operating system and distribution, for instance Ubuntu 20.04 LTS. A Shape is the physical
configuration of the node, typically involving the amount of CPU cores, memory and potential
accelerators. Location is typically the physical location of where the resource is to be created,
and a Network is the network configuration, including which Subnet, Gateway, and IP address
the compute node should utilize. In the context of a federated network like a Grid, the
orchestration would ideally involve the automated provisioning of the computational resource,
the configuration of said resource, and ensuring that the resource is correctly reachable through
a network infrastructure.
        </p>
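        <p>To make the workflow’s inputs concrete, the following minimal Python sketch (the type and field names are ours for illustration, not corc’s) bundles the four selections into a single request; the values mirror the OCI examples used elsewhere in the paper:</p>
        <p>from dataclasses import dataclass

# Illustrative encoding of the orchestration workflow's inputs: a compute
# node is established from a valid (Image, Shape, Location, Network)
# combination.
@dataclass
class ComputeNodeRequest:
    image: str       # target OS and distribution, e.g. "Ubuntu 20.04 LTS"
    shape: str       # physical configuration, e.g. "VM.GPU2.1"
    location: str    # e.g. the availability domain "EU-FRANKFURT-1-AD-2"
    subnet: str      # network settings the node should utilize
    gateway: str
    ip_address: str

request = ComputeNodeRequest(
    image="Ubuntu 20.04 LTS",
    shape="VM.GPU2.1",
    location="EU-FRANKFURT-1-AD-2",
    subnet="10.0.0.0/24",      # placeholder network values
    gateway="10.0.0.1",
    ip_address="10.0.0.17",
)</p>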
        <p>Multiple projects have been developed that automate development and system administration
tasks such as maintenance, testing, upgrading, and configuration. These include packages
such as TerraForm [
          <xref ref-type="bibr" rid="ref46">46</xref>
          ], Puppet [
          <xref ref-type="bibr" rid="ref47">47</xref>
          ], Chef [
          <xref ref-type="bibr" rid="ref48">48</xref>
          ], and Ansible [
          <xref ref-type="bibr" rid="ref49">49</xref>
          ], all of which are open source
projects that can be utilized across a range of supported cloud providers. Nevertheless, in terms
of enabling workflows that can provide orchestration capabilities, these tools are limited in that
they typically only focus on a subset of the orchestration functionalities, such as provisioning
and deployment, or configuration and maintenance. For instance, TerraForm is a tool that focuses
on infrastructure deployment, whereas Puppet, Chef and Ansible are primarily concerned with
configuration and maintenance of existing systems. In contrast, commercial cloud providers
typically also provide their own orchestration-like tools and Software Development Kits (SDKs),
enabling the ability to interact with their respective cloud system. For instance, Oracle provides
the Oracle Cloud Infrastructure CLI [
          <xref ref-type="bibr" rid="ref50">50</xref>
          ] tool that can interact with their infrastructure. The
same applies to the Amazon AWS CLI [
          <xref ref-type="bibr" rid="ref51">51</xref>
          ], in addition to a vast complement of tool-kits [
          <xref ref-type="bibr" rid="ref52">52</xref>
          ] that
provide many different AWS functionalities including orchestration. In contrast, commercial
cloud provided tools are often limited to only supporting the publishing cloud vendor and do not
offer cross-cloud compatibility, or the ability to utilize multiple cloud providers interchangeably.
        </p>
        <p>
          Cloud orchestration developments for the scientific community, especially those aiming to
provide cross-cloud deployments, have mostly been based on utilizing on premise cloud IaaS
platforms such as OpenStack [
          <xref ref-type="bibr" rid="ref53">53</xref>
          ] and OpenNebula [
          <xref ref-type="bibr" rid="ref54">54</xref>
          ]. Developments have focused on
providing higher layers of abstraction to expose a common APIs that allow for the interchangeable
usage of the underlying supported IaaS platforms. The infrastructure is typically defined in these
frameworks through a Domain Specific Language (DSL) that describes how the infrastructure
should look when orchestrated. Examples of this include cloud projects such as INDIGO-cloud
[
          <xref ref-type="bibr" rid="ref55">55</xref>
          ] [
          <xref ref-type="bibr" rid="ref56">56</xref>
          ], AgroDAT [
          <xref ref-type="bibr" rid="ref57">57</xref>
          ] and Occopus [
          <xref ref-type="bibr" rid="ref57">57</xref>
          ]. These frameworks nonetheless do not allow for
the utilization of commercial or public cloud platforms, since they rely on the utilization of
organisationally defined clouds that are traditionally deployed, managed, and hosted by the
organisation itself. Such support is, however, required if, as stated, we are to establish a Grid of
Clouds that allows for the inclusion of public and commercial cloud platforms. The corc framework
was developed and designed to eventually support the scheduling of cloud resources across
both organisations and public cloud providers.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. The first cloud enabled service</title>
      <p>To establish a Grid of Cloud resources, we started with enabling the usage of a single public
cloud provider to schedule DAG Notebooks on. Through this we created the foundations for the
eventual Grid structure that would allow the resources to be scheduled across multiple clouds
and organisations.</p>
      <p>The corc framework was implemented as a Python package. The package establishes the
foundations for essential functions such as orchestration, computation, configuration, and
authentication against supported cloud providers and cloud resources. Overall, corc is a
combination of an Infrastructure as a Service (IaaS) management library and a computation
oriented scheduler. This enables the ability to schedule services on a given orchestrated
resource. An overview of the architecture, with its Orchestrator, Compute, Storage, Scheduler,
Job, Configurer, and Authenticator components, can be seen in figure 4.1.</p>
      <p>The first provider to be integrated into the framework was the OCI IaaS. This was chosen
because UCPH had a preexisting collaboration with Oracle that enabled the usage of
donated cloud resources for testing and development. As also highlighted, this does not limit
the integration of other cloud providers into the framework, which the framework was designed
for. Furthermore, as explored in section 2.3, a new Spawner, named MultipleSpawner, was
introduced to provide the necessary dynamic selection of cloud providers.</p>
      <p>
        As Figure 4.1 indicates, for each provider that corc supports, an orchestrator for that provider
needs to be defined within corc. In addition, the framework defines three other top level
components, namely Compute, Configurer, and Authenticator. All three are abstract definitions
allowing for specific implementations to support the targeted resources which they apply to. A
service can therefore be enabled with the ability to utilize cloud resources by integrating the
corc components into the service itself. This method is limited to services that are developed
in Python. In addition, corc also defines a Command Line Interface (CLI), that can be used to
interact with the cloud provided resources directly. Details about how the framework and CLI
can be used will not be presented in this paper, but can be found in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
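      <p>A minimal sketch of what such abstract definitions could look like is given below; the class and method names are illustrative assumptions rather than corc’s actual API, which is documented in [<xref ref-type="bibr" rid="ref13">13</xref>]:</p>
      <p>from abc import ABC, abstractmethod

# Illustrative only: sketches the kind of abstract top level components
# described above, to be implemented per provider or resource type.
class Orchestrator(ABC):
    @abstractmethod
    def orchestrate(self, minimum: dict) -> str:
        """Create a resource meeting the minimum requirements, e.g.
        {"cpu": 12, "memory_gb": 72, "gpus": 1}, and return its endpoint."""

class Authenticator(ABC):
    @abstractmethod
    def prepare(self, endpoint: str) -> None:
        """Establish credentials (e.g. SSH certificates) for the resource."""

class Configurer(ABC):
    @abstractmethod
    def apply(self, endpoint: str) -> None:
        """Install the dependencies a scheduled service needs."""

class Compute(ABC):
    @abstractmethod
    def schedule(self, endpoint: str, job: dict) -> None:
        """Run a job, such as a Notebook session, on the resource."""</p>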
      <p>{
  "virtual_machine": [
    {
      "name": "oracle_linux_7_8",
      "provider": "oci",
      "image": "Oracle Linux 7.8"
    }
  ]
}</p>
      <sec id="sec-3-1">
        <title>Listing 1: Spawner Deployment configuration</title>
        <sec id="sec-3-1-1">
          <title>4.2. MultipleSpawner</title>
          <p>
            MultipleSpawner [
            <xref ref-type="bibr" rid="ref58">58</xref>
            ] is a Python package allowing for the dynamic selection of Spawners and
resources. Structurally, it is inspired by the WrapSpawner [
            <xref ref-type="bibr" rid="ref27">27</xref>
            ], though the MultipleSpawner
integrates corc into the Spawner itself. This enables the JupyterHub service to manage and utilize
cloud resources on a dynamic set of providers. In order to enable the MultipleSpawner to support
these dynamic resource providers, two JSON configuration files need to be defined. One of
these is shown in listing 1, and defines the specific resource type that should be deployed on
the provider. Currently the MultipleSpawner supports deploying ‘virtual_machine’, ‘container’,
and ‘bare_metal’ resources. The other configuration file is shown in listing 2. It defines the
template configuration settings that specify which Spawner, Configurer, and Authenticator the
MultipleSpawner should use to spawn, configure and connect to the deployed resource.
" name " : " V i r t u a l M a c h i n e Spawner " ,
" r e s o u r c e _ t y p e " : " v i r t u a l _ m a c h i n e " ,
" p r o v i d e r s " : [ " o c i " ] ,
" spawner " : {
" c l a s s " : " sshspawner . sshspawner . SSHSpawner " ,
" kwargs " : {
" r e m o t e _ h o s t s " : [ " { e n d p o i n t } " ] ,
" r e m o t e _ p o r t " : " 2 2 " ,
" s s h _ k e y f i l e " : " ~ / . c o r c / s s h / i d _ r s a " ,
" remote_port_command " : " / u s r / b i n / python3
/ u s r / l o c a l / b i n / g e t _ p o r t . py "
}
} ,
" c o n f i g u r e r " : {
" c l a s s " : " c o r c . c o n f i g u r e r . A n s i b l e C o n f i g u r e r " ,
" o p t i o n s " : {
" h o s t _ v a r i a b l e s " : {
" a n s i b l e _ u s e r " : " opc " ,
" a n s i b l e _ b e c o m e " : " yes " ,
" a n s i b l e _ b e c o m e _ m e t h o d " : " sudo " ,
" new_username " : " { JUPYTERHUB_USER } "
} ,
" h o s t _ s e t t i n g s " : {
" group " : " compute " ,
" p o r t " : " 2 2 "
} ,
" apply_kwargs " : {
          </p>
          <p>" p l a y b o o k _ p a t h " : " s e t u p _ s s h _ s p a w n e r . yml "
}
} ,
" a u t h e n t i c a t o r " : {
" c l a s s " : " c o r c . a u t h e n t i c a t o r . S S H A u t h e n t i c a t o r " ,
" kwargs " : { " c r e a t e _ c e r t i f i c a t e " : " True " }</p>
    </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Results</title>
      <sec id="sec-4-1">
        <title>Listing 2: Spawner Template configuration</title>
        <p>By integrating corc into the MultipleSpawner, we enabled the architecture shown in figure 5,
where the DAG service is able to dynamically schedule Jupyter Notebooks across the two
resource providers. As is indicated by figure 5, the UCPH and OCI providers are defined to
orchestrate resources, in this case cloud compute instances, in preparation for scheduling a
requested Notebook. In order to validate that the architecture worked as expected, we set up
a test environment on a separate machine. This machine was configured with a corc and
JupyterHub environment, where OCI was defined as a corc provider and the MultipleSpawner
as the designated JupyterHub Spawner. With this in order, the JupyterHub service was ready to
be launched on the machine.</p>
        <p>
          The MultipleSpawner was configured to use the template and deployment settings defined in
listings 1 and 2. This enables the MultipleSpawner to create Virtual Machine cloud resources at
the OCI. Subsequently, the MultipleSpawner uses the SSHSpawner [
          <xref ref-type="bibr" rid="ref59">59</xref>
          ] created by the National
Energy Research Scientific Computing (NERSC) Center to connect and launch the Notebook
on the orchestrated resource. Prior to this, it uses the corc defined SSHAuthenticator and
AnsibleConfigurer to ensure that the MultipleSpawner can connect to a particular spawned
resource and subsequently configure it with the necessary dependencies.
        </p>
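        <p>For reference, a spawner such as this is selected in JupyterHub’s jupyterhub_config.py; a minimal sketch follows, where the dotted class path is an assumption for illustration and only the spawner_class setting itself is standard JupyterHub configuration:</p>
        <p># jupyterhub_config.py (sketch)
c.JupyterHub.spawner_class = "multiplespawner.MultipleSpawner"</p>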
        <p>An example of such a spawn with the specified requirements can be seen in figure 6. To
validate that this resource had been correctly orchestrated, the corc CLI was utilized to fetch the
current allocated resources on OCI. Listing 3 shows that an instance with 12 Oracle CPUs, 72
GB of memory and one NVIDIA P100 GPU had been orchestrated. This reflects the minimum
shape that could be found in the EU-FRANKFURT-1-AD-2 availability domain that met the GPU
requirement.</p>
        <p>rasmusmunk$ corc oci orchestration instance list
{
  "instances": [
    {
      ...
      "availability_domain": "lfcb:EU-FRANKFURT-1-AD-2",
      "display_name": "instance20201018103638",
      "image_id": "ocid1.image.oc1.eu-frankfurt....",
      "shape": "VM.GPU2.1",
      "shape_config": {
        ...
        "gpus": 1,
        "max_vnic_attachments": 12,
        "memory_in_gbs": 72.0,
        "ocpus": 12.0,
        ...
      }
    }
  ],
  "status": "success"
}</p>
        <p>Listing 3: Running OCI Notebook Instance</p>
        <p>Figure 5: DAG MultipleSpawner Architecture, R = Resource</p>
      <p>As shown in figure 7, the JupyterHub spawn action redirected the Web interface to the hosted
Notebook on the cloud resources. Relating this to the mentioned courses at UCPH, this then
enabled the students with access to an interactive programming environment via the JupyterLab
interface.</p>
      <p>Building upon this, a simple benchmark was made to evaluate the gain in getting access to a
compute resource with an NVIDIA P100 GPU. A Notebook with the Tensorflow and Keras quick
start application [
        <xref ref-type="bibr" rid="ref60">60</xref>
        ] was used to get a rough estimate of how much time would be saved in
building a simple neural network that classifies images. Listing 5 shows the results of running
the notebook on the GPU powered compute resource ten times in a row, and listing 4 shows
the results of running the same benchmark on an existing DAG resource. As this shows, the
GPU version was on average 24,7 seconds faster, or in other words gained on average a 2,8
speedup compared to the DAG resources.</p>
      <p>(python3) jovyan@56e3c30c2af6:~/work/cte_2020_paper/notebooks$ \
&gt; python3 beginner.py
Took: 19.479900360107423
Took: 12.859123706817627
Took: 13.047297318618774
Took: 13.296776056289673
Took: 13.002363204956055
Took: 13.118329048156738
Took: 13.067508935928345
Took: 13.089284658432007
Took: 13.160099506378174
Took: 13.032178401947021
Average: 13.715285706520081</p>
      <p>Listing 5: OCI GPU compute resource Tensorflow times</p>
        <p>From this simple benchmarking example, we can see that by utilizing the MultipleSpawner
in combination with corc, users are able to get access through a simple gateway to the expected
performance gains of accelerators like a GPU. Expanding on this, the teachers and students at
UCPH will now be able to request a compute resource with a GPU on demand, thereby gaining
simple access to achieving similar faster runtimes in their exercises and assignments.</p>
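      <p>For context, a sketch of such a timing loop is shown below; it assumes the TensorFlow and Keras quick start model and prints in the same format as listing 5 (the actual beginner.py is not reproduced here):</p>
      <p>import time
import tensorflow as tf

# Train the quick start classifier repeatedly, timing each run.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0

times = []
for _ in range(10):
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    start = time.time()
    model.fit(x_train, y_train, epochs=5, verbose=0)
    times.append(time.time() - start)
    print(f"Took: {times[-1]}")

print(f"Average: {sum(times) / len(times)}")</p>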
    </sec>
    <sec id="sec-5">
      <title>6. Conclusions and Future Work</title>
      <p>In this paper, we presented our work towards establishing a Grid of Clouds that enables
organisations, such as educational institutions to share computational resources amongst
themselves and external collaborators. To accomplish this, we introduced corc as a basic building
block that enables the ability to orchestrate, authenticate, configure, and schedule computation on
a set of resources at a supported provider.</p>
      <p>OCI was the first provider we chose to support in corc, foremost because of the existing
collaboration with UCPH and the associated credits that got donated to this project. This
enabled us to utilize said provider to cloud enable part of the DAG service at UCPH. This
was made possible through the introduction of the MultipleSpawner package that utilized
corc to dynamically choose between supported cloud providers. We demonstrated that the
MultipleSpawner was capable of scheduling and stopping orchestrated and configured resources
at OCI via a local researcher’s machine.</p>
      <p>In terms of future work, the next step involves the establishment of a Grid layer on top of the
UCPH and OCI clouds. This Grid layer is planned to enable the establishment of a federated
pool of participating organisations that share their resources. By doing so, we will be able to
dynamically utilize cross-organisation resources for services such as DAG, allowing us for
instance to spawn Notebooks across multiple institutions such as other universities, enabling
the sharing of underused resources across the Grid participants. To accomplish this, corc also
needs to be expanded to support additional providers, foremost through the integration of the
Apache libcloud [<xref ref-type="bibr" rid="ref61">61</xref>] library, which natively supports more
than 30 providers. This will allow corc, and subsequently the MultipleSpawner, to be utilized
across a wide range of cloud providers.</p>
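      <p>As a short sketch of the provider-independent access Apache libcloud offers (the credentials and region below are placeholders):</p>
      <p>from libcloud.compute.providers import get_driver
from libcloud.compute.types import Provider

# One driver interface across 30+ providers; EC2 is used here as an example.
cls = get_driver(Provider.EC2)
driver = cls("access-key", "secret-key", region="eu-central-1")
for size in driver.list_sizes()[:3]:
    print(size.id, size.name, size.ram)</p>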
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This project has received funding from the European Union’s Horizon 2020 research and
innovation programme under the Marie Skłodowska-Curie grant agreement No 765604. Furthermore,
many thanks are given to Oracle for donating the cloud resources that made this project possible.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. V.</given-names>
            <surname>Kale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gioachin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>March</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>Suen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Faraboschi</surname>
          </string-name>
          , R. Kaufmann, D. Milojicic,
          <article-title>The who, what, why, and how of high performance computing in the cloud</article-title>
          ,
          <source>in: 2013 IEEE 5th International Conference on Cloud Computing Technology and Science</source>
          , volume
          <volume>1</volume>
          ,
          <year>2013</year>
          , pp.
          <fpage>306</fpage>
          -
          <lpage>314</lpage>
          . doi:10.1109/CloudCom.2013.47.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Vinter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bardino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rehr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Birkelund</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. O.</given-names>
            <surname>Larsen</surname>
          </string-name>
          ,
          <article-title>Imaging data management system</article-title>
          ,
          <source>in: Proceedings of the 1st International Workshop on Next Generation of Cloud Architectures</source>
          , CloudNG '17, Association for Computing Machinery, New York, NY, USA,
          <year>2017</year>
          . URL: https://doi.org/10.1145/3068126.3071061. doi:10.1145/3068126.3071061.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Häfner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Jacobsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Eden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R. B.</given-names>
            <surname>Kristensen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jochum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Nuterman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Vinter</surname>
          </string-name>
          ,
          <article-title>Veros v0.1 - a fast and versatile ocean simulator in pure python</article-title>
          ,
          <source>Geoscientific Model Development</source>
          <volume>11</volume>
          (
          <year>2018</year>
          )
          <fpage>3299</fpage>
          -
          <lpage>3312</lpage>
          . URL: https://gmd.copernicus.org/articles/11/3299/2018/. doi:10.5194/gmd-11-3299-2018.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Padoan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Juvela</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Haugbølle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Å.</given-names>
            <surname>Nordlund</surname>
          </string-name>
          ,
          <article-title>The origin of massive stars: The inertial-inflow model</article-title>
          ,
          <source>Astrophysical Journal</source>
          <volume>900</volume>
          (
          <year>2020</year>
          ). doi:10.3847/1538-4357/abaa47.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] University of Copenhagen policy for scientific data,
          <source>Technical Report</source>
          , University of Copenhagen, Copenhagen,
          <year>2014</year>
          . URL: https:// kunet.ku.dk/arbejdsomraader/forskning/data/forskningsdata/Documents/ Underskrevetogendeligversionafpolitikforopbevaringaforskningsdata.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] University of Copenhagen,
          <source>SCIENCE AI Centre</source>
          ,
          <year>2020</year>
          . URL: https://ai.ku.dk/research/.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] University of Antwerp,
          <source>High Performance Computing CalcUA</source>
          ,
          <year>2020</year>
          . URL: https://www. uantwerp.be/en/core-facilities/calcua/.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] Lund University, LUNARC: Lund University Computing Center,
          <year>2020</year>
          . URL: https://www. maxiv.lu.se/users/it-services/lunarc/.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] VSC - Vienna Scientific Cluster, VSC - Vienna Scientific Cluster,
          <year>2009</year>
          . URL: https://vsc.ac. at//access/.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10] The University of Edinburgh, ARCHER2 on-demand,
          <year>2019</year>
          . URL: https://www.epcc.ed.ac. uk/facilities/demand-computing/
          <year>archer2</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <article-title>Directorate-General for Communications Networks, Content and Technology (European Commission), State of the Union 2020</article-title>
          .
          <article-title>EuroHPC: The European Joint Undertaking on High-Performance Computing</article-title>
          ,
          <source>Technical Report</source>
          ,
          <year>2020</year>
          . URL: https://op.europa.eu/en/publication-detail/-/publication/df20041-f247-11ea-991b-01aa75ed71a1/language-en. doi:10.2759/26995.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>I.</given-names>
            <surname>Foster</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          <article-title>Kesselman, High Performance Computing: From Grids and Clouds to Exascale</article-title>
          , volume
          <volume>20</volume>
          of Advances in Parallel Computing, IOS Press,
          <year>2011</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>30</lpage>
          . doi:10.3233/978-1-60750-803-8-3.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Munk</surname>
          </string-name>
          ,
          <article-title>corc: An open source tool for orchestrating Multi-Cloud resources and scheduling workloads</article-title>
          ,
          <year>2021</year>
          . URL: https://github.com/rasmunk/corc.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Instructure</surname>
          </string-name>
          ,
          <string-name>
            <surname>Canvas</surname>
            <given-names>LMS</given-names>
          </string-name>
          ,
          <year>2021</year>
          . URL: https://www.instructure.com/canvas/about.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bardino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rehr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Vinter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Munk</surname>
          </string-name>
          ,
          <string-name>
            <surname>ERDA</surname>
          </string-name>
          ,
          <year>2021</year>
          . URL: https://www.erda.dk.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Rasmus</surname>
            <given-names>Munk</given-names>
          </string-name>
          , jupyter_service,
          <year>2020</year>
          . URL: https://github.com/ucphhpc/jupyter_service.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>G.</given-names>
            <surname>Zaccone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Karim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Menshawy</surname>
          </string-name>
          ,
          <article-title>Deep Learning with TensorFlow</article-title>
          , Packt,
          <year>2017</year>
          , p.
          <fpage>320</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18] University of Copenhagen, Introduction to Computing for Physicists,
          <year>2021</year>
          . URL: https: //kurser.ku.dk/course/nfya06018u/.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19] University of Copenhagen, Applied Statistics: From Data to Results,
          <year>2021</year>
          . URL: https: //kurser.ku.dk/course/nfyk13011u.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20] University of Copenhagen,
          <source>High Performance Parallel Computing</source>
          ,
          <year>2021</year>
          . URL: https:// kurser.ku.dk/course/nfyk18001u/.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Berthold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bardino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Vinter</surname>
          </string-name>
          ,
          <article-title>A principled approach to grid middleware</article-title>
          , in: Y.
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Cuzzocrea</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Hobbs</surname>
          </string-name>
          , W. Zhou (Eds.),
          <article-title>Algorithms and Architectures for Parallel Processing</article-title>
          , volume
          <volume>7016</volume>
          of Lecture Notes in Computer Science, Springer, Berlin, Heidelberg,
          <year>2011</year>
          , pp.
          <fpage>409</fpage>
          -
          <lpage>418</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>642</fpage>
          -24650-0{\_}
          <fpage>35</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Project</surname>
            <given-names>Jupyter</given-names>
          </string-name>
          , About us,
          <year>2021</year>
          . URL: https://jupyter.org/about.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>F.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <surname>B. E. Granger,</surname>
          </string-name>
          <article-title>IPython: A system for interactive scientific computing</article-title>
          ,
          <source>Computing in Science Engineering</source>
          <volume>9</volume>
          (
          <year>2007</year>
          )
          <fpage>21</fpage>
          -
          <lpage>29</lpage>
          . doi:
          <volume>10</volume>
          .1109/
          <string-name>
            <surname>MCSE</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <volume>53</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Project</surname>
            <given-names>Jupyter</given-names>
          </string-name>
          ,
          <source>JupyterLab Documentation</source>
          ,
          <year>2018</year>
          . URL: http://jupyterlab.readthedocs.io/ en/stable/.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Project</surname>
            <given-names>Jupyter</given-names>
          </string-name>
          , JupyterHub,
          <year>2020</year>
          . URL: https://pypi.org/project/jupyterhub/.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>J.</given-names>
            <surname>Crist</surname>
          </string-name>
          , Spawners,
          <year>2019</year>
          . URL: https://github.com/jupyterhub/jupyterhub/wiki/Spawners.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <source>[27] wrapspawner for Jupyterhub</source>
          ,
          <year>2020</year>
          . URL: https://github.com/jupyterhub/wrapspawner.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>S.</given-names>
            <surname>Proskura</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. Lytvynova,</surname>
          </string-name>
          <article-title>The approaches to web-based education of computer science bachelors in higher education institutions</article-title>
          , volume
          <volume>2643</volume>
          ,
          <string-name>
            <surname>CEUR-WS</surname>
          </string-name>
          ,
          <year>2020</year>
          , pp.
          <fpage>609</fpage>
          -
          <lpage>625</lpage>
          . URL: http://ceur-ws.
          <source>org/</source>
          Vol-
          <volume>2643</volume>
          /paper36.pdf, 7th Workshop on Cloud Technologies in Education,
          <source>CTE 2019 ; Conference Date: 20 December</source>
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>GitHub</surname>
          </string-name>
          , Where the world builds software,
          <year>2021</year>
          . URL: https://www.github.com.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <surname>Google</surname>
          </string-name>
          ,
          <source>Google Docs: Free Online Documents for Personal Use</source>
          ,
          <year>2021</year>
          . URL: https://www. google.com/docs/about/.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>Google</surname>
          </string-name>
          , Welcome to Colaboratory,
          <year>2021</year>
          . URL: https://colab.research.google.com/ notebooks/intro.ipynb.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Kaggle</given-names>
            <surname>Inc</surname>
          </string-name>
          .,
          <source>Kaggle: Your Machine Learning and Data Science Community</source>
          ,
          <year>2019</year>
          . URL: https://www.kaggle.com.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <surname>Project</surname>
            <given-names>Jupyter</given-names>
          </string-name>
          , Binder,
          <year>2017</year>
          . URL: https://mybinder.org/.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <surname>Microsoft</surname>
          </string-name>
          , Microsoft Azure Notebooks,
          <year>2021</year>
          . URL: https://notebooks.azure.com.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <surname>Sagemath</surname>
          </string-name>
          , Inc.,
          <source>CoCalc - Collaborative Calculation and Data Science</source>
          ,
          <year>2021</year>
          . URL: https: //cocalc.com.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <surname>JetBrains</surname>
          </string-name>
          , Datalore - Online
          <source>Data Science Notebook by JetBrains</source>
          ,
          <year>2020</year>
          . URL: https:// datalore.jetbrains.com.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>The</given-names>
            <surname>Binder</surname>
          </string-name>
          <string-name>
            <surname>Team</surname>
          </string-name>
          ,
          <source>Frequently Asked Questions - Binder 0.1b documentation</source>
          ,
          <year>2017</year>
          . URL: https://mybinder.readthedocs.io/en/latest/faq.html.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>Kaggle</given-names>
            <surname>Inc</surname>
          </string-name>
          .,
          <source>Kaggle Notebooks Documentation</source>
          ,
          <year>2021</year>
          . URL: https://www.kaggle.com/docs/ notebooks.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <surname>Google</surname>
          </string-name>
          , Colaboratory: Frequently Asked Questions ,
          <year>2021</year>
          . URL: https://research.google. com/colaboratory/faq.html.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <surname>Microsoft</surname>
          </string-name>
          , Azure Notebooks Overview,
          <year>2019</year>
          . URL: http://web.archive. org/web/20200818200412/https://docs.microsoft.com/en-us/azure/notebooks/ azure-notebooks-overview.
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <surname>Microsoft</surname>
          </string-name>
          ,
          <article-title>Quickstart: Create a project with a custom environment</article-title>
          ,
          <year>2018</year>
          . URL: http://web.archive.org/web/20190607015705/https://docs.microsoft.com/en-us/azure/ notebooks/quickstart-create
          <article-title>-jupyter-notebook-project-environment.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <surname>Sagemath</surname>
          </string-name>
          , Inc., What is CoCalc?,
          <year>2021</year>
          . URL: https://doc.cocalc.com/index.html.
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <surname>JetBrains</surname>
          </string-name>
          , Billing documentation,
          <year>2021</year>
          . URL: https://datalore.jetbrains.com/documentation.
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>Kaggle</given-names>
            <surname>Inc</surname>
          </string-name>
          .,
          <source>Eficient GPU Usage Tips and Tricks</source>
          ,
          <year>2020</year>
          . URL: https://www.kaggle.com/ page/GPU-tips
          <article-title>-and-tricks.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <surname>Red</surname>
            <given-names>Hat Inc.</given-names>
          </string-name>
          , What is orchestration?,
          <year>2021</year>
          . URL: https://www.redhat.com/en/topics/ automation/what-is-orchestration.
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <surname>Terraform</surname>
            ,
            <given-names>Terraform</given-names>
          </string-name>
          <string-name>
            <surname>Documentation</surname>
          </string-name>
          ,
          <year>2021</year>
          . URL: https://www.terraform.io/docs/.
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <surname>Puppet</surname>
          </string-name>
          ,
          <article-title>Powerful infrastructure automation</article-title>
          and delivery,
          <year>2021</year>
          . URL: https://puppet.com.
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <surname>Chef</surname>
            ,
            <given-names>Chef</given-names>
          </string-name>
          <string-name>
            <surname>Infra</surname>
          </string-name>
          ,
          <year>2021</year>
          . URL: https://www.chef.io/products/chef-infra.
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <surname>Red</surname>
            <given-names>Hat</given-names>
          </string-name>
          , Inc., Ansible is Simple IT Automation ,
          <year>2021</year>
          . URL: https://www.ansible.com.
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [50]
          <string-name>
            <surname>Oracle</surname>
            <given-names>Corporation</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oracle Cloud Infrastructure</surname>
            <given-names>CLI</given-names>
          </string-name>
          ,
          <year>2019</year>
          . URL: https://github.com/oracle/ oci-cli.
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          [51]
          <string-name>
            <given-names>Amazon</given-names>
            <surname>Web</surname>
          </string-name>
          <string-name>
            <surname>Services</surname>
          </string-name>
          , Inc.,
          <source>AWS Command Line Interface</source>
          ,
          <year>2021</year>
          . URL: https://aws.amazon. com/cli/.
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          [52]
          <string-name>
            <given-names>Amazon</given-names>
            <surname>Web</surname>
          </string-name>
          <string-name>
            <surname>Services</surname>
          </string-name>
          , Inc., Tools to build on AWS:
          <article-title>Tools for developing and managing applications on</article-title>
          AWS,
          <year>2021</year>
          . URL: https://aws.amazon.com/tools/.
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          [53]
          <string-name>
            <surname>OpenStack</surname>
          </string-name>
          , Open Source Cloud Computing Infrastructure - OpenStack,
          <year>2021</year>
          . URL: https: //www.openstack.org.
        </mixed-citation>
      </ref>
      <ref id="ref54">
        <mixed-citation>
          [54]
          <string-name>
            <given-names>OpenNebula</given-names>
            <surname>Systems</surname>
          </string-name>
          , OpenNebula - Open
          <source>Source Cloud &amp; Edge Computing Platform</source>
          ,
          <year>2021</year>
          . URL: https://opennebula.io.
        </mixed-citation>
      </ref>
      <ref id="ref55">
        <mixed-citation>
          [55]
          <string-name>
            <surname>INDIGO - DataCloud</surname>
          </string-name>
          , INDIGO DataCloud ,
          <year>2020</year>
          . URL: http://web.archive.org/web/ 20200512041341/https://www.indigo-datacloud.eu/.
        </mixed-citation>
      </ref>
      <ref id="ref56">
        <mixed-citation>
          [56]
          <string-name>
            <given-names>M.</given-names>
            <surname>Caballer</surname>
          </string-name>
          , S. Zala, a. L. Garc´a, G. Molt´,
          <string-name>
            <given-names>P. O.</given-names>
            <surname>Fernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Velten</surname>
          </string-name>
          ,
          <article-title>Orchestrating complex application architectures in heterogeneous clouds</article-title>
          ,
          <source>Journal of Grid Computing</source>
          <volume>16</volume>
          (
          <year>2018</year>
          )
          <fpage>3</fpage>
          -
          <lpage>18</lpage>
          . doi:
          <volume>10</volume>
          .1007/s10723-017-9418-y.
        </mixed-citation>
      </ref>
      <ref id="ref57">
        <mixed-citation>
          [57]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kovács</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kacsuk</surname>
          </string-name>
          ,
          <article-title>Occopus: a multi-cloud orchestrator to deploy and manage complex scientific infrastructures</article-title>
          ,
          <source>Journal of Grid Computing</source>
          <volume>16</volume>
          (
          <year>2018</year>
          )
          <fpage>19</fpage>
          -
          <lpage>37</lpage>
          . doi:
          <volume>10</volume>
          .1007/ s10723-017-9421-3.
        </mixed-citation>
      </ref>
      <ref id="ref58">
        <mixed-citation>
          [58]
          <string-name>
            <given-names>R.</given-names>
            <surname>Munk</surname>
          </string-name>
          , multiplespawner,
          <year>2021</year>
          . URL: https://github.com/ucphhpc/multiplespawner.
        </mixed-citation>
      </ref>
      <ref id="ref59">
        <mixed-citation>
          [59] NERSC, sshspawner,
          <year>2020</year>
          . URL: https://github.com/NERSC/sshspawner.
        </mixed-citation>
      </ref>
      <ref id="ref60">
        <mixed-citation>
          [60]
          <article-title>NVIDIA, TensorFlow 2 quickstart for beginners</article-title>
          ,
          <year>2021</year>
          . URL: https://www.tensorflow.org/
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>