A Low Carbon Kubernetes Scheduler

Aled James
Email: aledjms@gmail.com

Daniel Schien
University of Bristol, UK
Email: Daniel.Schien@bristol.ac.uk

Abstract: A major source of global greenhouse gas emissions is the burning of fossil fuels for the generation of electricity. The portion of electricity generated from fossil fuel varies across regions, and within a region with demand for electricity and the availability of renewable energy sources. Cloud providers operate data centres in locations around the planet, and certain kinds of server computation can tolerate migration between data centres. In this paper we describe the design and implementation of a low carbon scheduling policy for the open-source Kubernetes container orchestrator. We apply this scheduler in a form of demand side management by migrating consumption of electric energy to countries with the lowest carbon intensity of electricity. The primary contributions of this text are (i) the scheduler's design, which provides a generic model for optimising workload placement in regions with the lowest carbon intensity, (ii) an evaluation of its performance in a case study with a major public cloud provider, and (iii) an implementation of a demand side management solution that consumes electricity where, instead of when, grid carbon intensity is lowest.

Index Terms: Kubernetes; green computing; DSM; Demand Side Management; renewable energy; grid carbon intensity

I. INTRODUCTION

Cloud datacentres typically comprise tens to thousands of interconnected servers and consume a substantial amount of electrical energy [1]. [2] estimates that by 2030 datacentres will use anywhere between 3% and 13%¹ of global electricity. All major cloud computing companies acknowledge the need to run their datacentres as efficiently as possible in order to address economic and environmental concerns, and recognise that ICT consumes an increasing amount of energy. As one example of a response, Google Cloud Platform has run its datacentres entirely on renewable energy since the end of 2017 [3], while Microsoft have announced that their global operations have been carbon neutral since 2012 [4]. Not all cloud providers have been able to make such an extensive commitment; Oracle Cloud, for example, is currently 100% carbon neutral in Europe, but not in other regions [5]. Many of these companies' claims come with the caveat that their carbon emissions are not zero, but are offset by financial instruments which invest in future renewable energy generation or carbon capture; these future reductions are then netted off against the current year's greenhouse gas emissions [6].

As the availability of renewable energy at a particular location is inherently variable, the electricity in the local grid that datacentres draw from is typically generated from both renewable ('green') energy and fossil fuel or nuclear based energy sources ('brown energy'), in order to compensate for the intermittent nature of renewable energy generation. Solar photovoltaic (PV) power production primarily depends on the amount of solar irradiation (insolation) reaching the solar panel; however, that irradiation is not uniformly distributed over time [7]. In addition to the rotation of the earth, weather and intermittent clouds block the Sun's rays and thus influence solar power generation output.

Intermittency of availability of renewable energy sources is one of the factors driving demand side management (DSM) in the electric grid, where consumers of electricity alter their energy consumption patterns. In the area of energy systems management, demand side management (in contrast to supply side interventions) refers to any initiatives (technical interventions, pricing models and monetary incentives) that affect how and when electricity is required by consumers. While much of the research on DSM focusses on domestic energy consumption, there has also been work investigating DSM by cloud data centres.

An important form of DSM is load shifting, whereby load on the electric grid (i.e. demand for electric energy) is rescheduled to a time of day during which the energy demand can be more easily met by renewable resources [8]. Fig. 1 provides a basic visualisation of the load shape objective of Load Shifting Demand Side Management.

In this paper we describe the proof of concept design and implementation of a low carbon scheduling policy for the open-source Kubernetes container orchestrator that can provide DSM for cloud data centres. The scheduler selects compute nodes based on the real-time carbon intensity of the electric grid in the region they are in. Real-time APIs that report grid carbon intensity are available for an increasing number of regions, but not exhaustively around the planet. In order to effectively demonstrate the scheduler's ability to perform global load balancing, we evaluate the scheduler based on its ability to optimise placement according to the metric of solar irradiation.

The text is organised as follows. In the next section we look at existing work related to energy consumption of cloud computing. In section III we derive the design of the scheduler, the implementation of which is detailed in section IV. In section V we evaluate the implementation before we conclude in section VI.

II.
BACKGROUND AND RELATED WORK Support for this research was generously provided by Microsoft Azure 1 The study’s authors acknowledge that the worst-case scenario of 13% is In their taxonomy of DSM techniques [9] list four main ‘exorbitant’, but ‘not totally unrealistic’ approaches: (a) energy efficiency, (b) time of use, (c) demand conventional power generation methods are required to meet demand during the downward (during sunrise) and upward (during sunset) sloping sections of the curve. Electricity consumption GreenSlot, proposed by Goiri et al. [12], is a scheduler which predicts the amount of solar energy that will be available in the near future, and schedules the workload to maximise green energy consumption while meeting the deadlines speci- fied by the job submitter. Greenslot, however, operates within a datacentre rather than between distributed datacentres, as we propose. Additionally, the system was implemented for two specific schedulers (SLURM and the MapReduce scheduler of Hadoop); not for Kubernetes. GreenSlot increased green energy usage by 19-21% by delaying jobs so that they are Time of day executed during periods of high green energy production or low brown energy prices [13]. GreenWorks is a framework that was proposed by Li et Fig. 1. Demand Side Managment (DSM) strategy - Load Shifting. The ‘duck curve’ of solar power generation can be observed, with energy generation al. [14] for datacentres powered by hybrid renewable energy peaking in the middle of the day systems. The framework considers the timing behaviours and [8] capacity constraints of different energy sources that are avail- able to a singular datacentre and makes optimal decisions response and (d) spinning reserve. Time of use refers to based on the energy mix available at any time. scheduling energy consumption outside of peak times. De- Wang et al. 
[15], consider a mix of green and brown mand response refers to reduction of electricity demand either energy sources, and use the following formula to implement via direct control of devices by the electric utility provider a k-nearest neighbour (k-NN) based algorithm to forecast or using electricty tariffs in order to create incentives for the solar energy level generation for the next day and make consumers to alter their behaviour. Spinning reserve refers to VM placement decisions accordingly.Additionally, their model the capability of some energy consuming devices to reduce is renewable-energy aware and considers the energy cost of their power consumption in response to changes to the grid datacentre cooling [15]. frequency that results from load in the grid. Among these four As highlighted by Brancucci et al. [11], Goiri et al. [13], Li categories, energy efficiency measures are most desirable as et al. [14] and Wang et al. [15], several solar power forecasting they result in long term reductions of energy consumption and technologies currently exist and are continuously improving, thus cost. with modelling efforts accelerating thanks to advancements in machine learning techniques. Antonanzas et al. also consider A. Energy-efficient cloud computing very short term forecasting, denoted as intra-hour or ‘now- Greater datacentre energy efficiency may be achieved casting’ [7]. All these works consider scheduling with regard through a number of different methods. Some of these, com- to renewable energy sources, but do so with consideration to piled by Zakarya and Gillam [10], are outlined on Fig.2. Some singular datacentres rather than taking a more global view, of these methods will be further explored in relation to cloud and deal with a mix of renewable and non-renewable energy computing in the proceeding section. 
The research outlined sources rather than variable renewable energy sources exclu- in this project chiefly pertains to the ‘load balancing’ and sively [15]. ‘renewables’ rows listed in Fig.2. The Low Carbon Scheduler on the other hand considers carbon intensity across regions as scaling up and down of a B. Green datacentres and renewable energy large number of containers can be done in a matter of seconds. Solar power generation is characterised by variability and Nonetheless, these works provide an understanding of green uncertainty. Business decisions considering where best to computing at the WSC level, knowledge of which can under- install photovoltaic (PV) arrays rely on historical solar irra- stand the trade-offs that are to be made in a geographically diation data, which measure the solar energy that reaches the distributed scheduler. earth’s surface over a long-term period. This usable energy varies according to latitude, elevation, season, and climate. C. Geographically distributed green datacentres The value of more short-term, namely day-ahead, solar power Reasoning that it is cheaper to transmit data over large dis- forecasting is discussed in Brancucci et al.’s 2017 paper [11], tances than it is to transmit power, one of the first papers that and indicates that such forecasting can lead to a reduction in suggested locating data centres near renewable energy sources overall solar energy generation costs. The paper discusses the was written in 2008 by Hopper and Rice [16]. As grids and ‘duck curve’, in which solar power generation is observed to datacentres are located in multiple regions that span the globe, be highest during the middle of the day, and can account for each is powered by different mixes of both green and brown a greater share of electrical power generation; however, more energy sources. Routing more user requests to the region Fig. 2. 
Current approaches to datacentres energy efficiency [10] Technique Explanation Benefits Shortcomings Virtualisation Dynamically provision re- Efficient energy saving Widely used, VM live mi- sources gration affects network per- formance Server consolidation and Reduces active servers by Increases the utilisation ra- Consequences from failure encapsulating application consolidating the workload tio of servers, reduce SLA of single consolidated of multiple servers violation ratio server DCP (Dynamic Capacity Adjust the available re- More energy-efficient Involves cost of switching Planning) sources tocurrent demand resources on/off; could vio- late customers’ SLAs Load Balancing Balance the workload Equal utilisation Challenging to implement among different servers to in a heterogeneous platform level out average utilisation Scheduling and VMs place- Place VMs onto a suit- Server and communication Planning and live migration ment able (most energy-efficient) system energy-efficient SLA violation servers Live migrations Migrate VMs from over- Less energy consumption Service level of running ap- utilised & under-utilised to plication affected more efficient servers Renewables Migrate VMs to servers op- More energy efficient and Renewables are intermittent erated by renewable energy economical & involve migrations that sources cost extra energy which is powered by cheaper production technology not only the scheduler considers local air temperature when making helps to save energy, but, according to Zakarya and Gillam placement decisions (see section IV-A). [10], offer at least three further benefits: (i) renewable energy ‘Green geographic load balancing’ was also used in a in oversupply allows for energy to be fed back to the electricity paper by Islam et al. 
[24]; however, this was with the aim grid; (ii) a high supply of renewables decreases demand for of rationing water consumption in datacentres rather than non-renewable sources from the electricity grid; (iii) lessened reducing carbon emissions. This is especially important during reliance on renewable energy storage reduces the costs of periods of drought, as experienced in California in recent management and replenishment of storage mechanisms, such years. Their algorithm, WATCH (WATer-constrained workload as batteries, and extends the life of these mechanisms. sCHeduling in data centers), dynamically dispatches workload Rahman et al. survey geographic load balancing of data across geographically distributed datacentres based on water centre workload [17]. Geographic load balancing for carbon availability. While this work does not directly relate to our reduction in the past has typically used request routing to direct proposed scheduler, the paper nonetheless demonstrates the demand relative to carbon intensity of electricity. Among feasibility of implementing a green scheduler, remembering them, [18] propose a traffic engineering framework, while [19] that ‘green’ is a term that can be applied to minimising propose a conceptual model based on Simulated Annealing consumption of natural resources in addition to encouraging optimisation. Here, we go beyond the model and propose renewable energy usage. a solution on the level of the infrastructure orchestration In 2012, Van Heddeghem et al. concluded that while provided by Kubernetes. deploying additional datacentres can help in reducing total Berl et al. 
[20] outlines how a geographically-distributed carbon footprint, substantial reductions could be achieved workload allocation system could work, and proposed moving when datacentres with nominal capacity well below maximum workload between datacentres if necessary in order to improve capacity redistribute workload to sites based on the availability energy efficiency. The paper focuses not on scheduling in of renewable energy [25]. The authors take a probabilistic accordance with low carbon electricity, but advocates allo- approach to chosing target data centres as opposed to our use cating work to cooler datacentres2 . A 2014 Paper by Zhang, of real-time API based on reported actual generation data. Shao et al. [22] estimated that 30 to 50% of a datacentre’s EcoPower is a system designed to perform eco-aware energy consumption comprises ‘cooling energy’. Oro and power management and load scheduling for geographically Salom came to a figure of 40% in 2015 [23]. For this reason distributed green cloud datacentres [26]. While the paper 2 Facebook published some details of their datacentres in Sweden, which concludes that wind-dominant, solar-complementary strategy highlighted their efforts to locate their datacentres in colder climes in order is superior for the integration of renewable energy sources into to minimise datacentre cooling costs [21] cloud datacentres’ infrastructure, the Low Carbon Scheduler Country Microsoft [29] Google [30] Amazon [31] Oracle [32] Australia x x x report the volume of electricity input to the grid in regular in- Belgium x tervals to the organisations operating the grid (for example the Brazil x x x x National Grid in the UK). Increasingly, this production data is Canada x x x China x x made available in real-time via APIs. 
For the European Union Denmark x such an API is provided by the European Network of Trans- Finland x mission System Operators for Electricity (www.entsoe.eu) and France x x Germany x x x x for the UK this is the Balancing Mechanism Reporting Service India x x x (www.elexon.co.uk). These APIs typically provide the retrieval Ireland x x of the production volumes and thus allow to calculate the Japan x x x Korea x x carbon intensity in real-time [33]. Our low carbon scheduler Malaysia x collects the carbon intensity from the available APIs and ranks Netherlands x x them to identify the region with lowest carbon intensity. Norway x Singapore x III. D ESIGN South Africa x Sweden x Kubernetes has been adopted and adapted for the purpose Switzerland x Taiwan x of scheduling workload around the globe. While section III-C UAE x outlines the design decisions made in order to enable a low UK x x x x carbon scheduling policy, a brief overview of Kubernetes and US-CA x x x x US-East x x x x the role of scheduling within Kubernetes is provided first. US-central x x x x A. Kubernetes and container orchestration Kubernetes, initially developed by Google and open-sourced provides a proof-of-concept demonstrating how to reduce in 2015, is based on the company’s experience of running carbon intensity in cloud computing. Also, the Low Carbon containers internally on Google’s own WSCs using its pro- Scheduler focuses on Kubernetes workloads, which is not the prietary Borg system [34]. The source code for Kubernetes case with EcoPower. Calculating energy usage is also widely was released in July 2015, and has grown to have more pull explored in the work of Khosravi et al. [27] on geographically requests and issue comments than any of the 54 million other distributed cloud datacentres. projects on GitHub [35]. Kubernetes was later donated to Hasan et al. 
discuss green cloud computing from a business the Cloud Native Computing Foundation, part of the Linux cloud user’s perspective: companies may choose to specify Foundation. a requirement for green energy usage in their Service Level The user provides the Kubernetes master3 with the desired Agreements (SLAs) with cloud computing providers [28]. In cluster configuration, typically in YAML format. Once the de- the paper they extend the Cloud Service Level Agreement sired state has been declared to the master, Kubernetes initiates (CSLA) language in order to incorporate two new threshold a reconciliation process to match the desired state of the cluster parameters that ensure that more environmentally sustainable with the current, actual, state. Once the desired cluster state policies are adhered to. The incentivisation of the Low Carbon has been achieved, the Kubernetes controllers are in an active Scheduler is discussed in section IV reconciliation process, i.e. they monitor for changes made to either the desired state (through user input) or the current state D. Geographically distributed cloud datacentres (through node or netwok failures, for example), and ensures that if a change is detected, the Kubernetes controllers carries The largest public Cloud providers operate data centres out the required operations to match the cluster’s current state around the planet. Table II-D lists the countries as of April with its desired state [36]. 2019. Kubernetes can make use of GPUs4 and has also been ported to run on ARM architecture5 . Kubernetes has to a large E. Real-time carbon intensity extent won the container orchestration war [41], [42]. This, Electricity in national electric grids is generated from a coupled with Kubernetes’s support for extendability and plug- variable mix of alternative sources. 
The carbon intensity of ins makes Kubernetes the most suitable for which to develop a the electricity provided by the grid anywhere in the world global scheduler and bring about the widest adoption, thereby is a measure of the amount of greenhouse gas released into producing the greatest impact on carbon emission reduction. the atmosphere from the combustion of fossil fuels for the 3 Through CLI/GUI/API generation of electricty. The carbon intensity is calculated as 4 Used extensively for Machine Learning and other GPU-intensive tasks the sum of the carbon intensity of the various energy sources such as graphics rendering - NVIDIA have released a container for use weighted by the relative production volumes per energy source with GPUs [37]. Microsoft has made use of Kubernetes for running deep (i.e. fuel type). The dominant types of fossil fuel used for learning models [38]. Kubernetes has also been used with great success for bioinformatic analysis [39] electricity generation are gas and coal. Significant generation 5 Which uses substantially less power than the CPUs of traditional server sites (excluding, for example, domestic solar PV installations) and desktop computers [40] Term Definition scheduling rules are called Priorities; these are scheduling Pod The atomic unit of a Kubernetes cluster; a rules that rank the remaining nodes according to preferences8 . group of one or more containers with shared A scheduling policy is a particular combination of predicates storage/network, and a specification for how and priorities. to run the containers nested within the pod The scheduler specifically Master Provides the cluster’s control plane. 
Master • (a) looks for Pods that aren’t assigned to a node (unbound components make global decisions about the Pods), cluster (for example, scheduling), and detect- • (b) examines the state of the cluster (cached in memory), ing and responding to cluster events, such as • (c) picks a node that has free space and meets other restarting stopped pods. constraints Fig. 3. Kubernetes architecture and basic terminology • (d) binds that Pod to a node. If multiple nodes are [43] assigned the same priority, a node is chosen at random [47] As outlined on Fig. 3, the Kubernetes master performs a C. Extending the Kubernetes scheduler number of roles, among them scheduling. Kubernetes allows The official Kubernetes documentation describes three pos- for schedulers to run in parallel, meaning that the scheduler sible ways of extending the default scheduler (kube-scheduler) will not need to re-implement the pre-existing, and sophisti- [48]: (i) adding these rules to the scheduler source code and cated, bin-packing strategies present in Kubernetes. It need recompiling, (ii) implementing one’s own scheduler process only apply a scheduling layer to compliment the existing that runs instead of, or alongside kube-scheduler, or (iii) capabilities proffered by Kubernetes. implementing a scheduler extender. B. Scheduling in Kubernetes D. Air temperature and solar irradiance Kubernetes builds on work that was done at Google for managing its internal cluster, called Borg [34], and later on As described in the literature review in section II-C, the lo- a project called Omega6 [45]. Facebook is believed to use a cal air temperature surrounding a datacentre affects the amount similar service called Tupperware [34]. 
Google’s publication of energy needed for cooling; air temperatureis therefore a of ‘Large-scale Cluster Management at Google with Borg’ [34] relevant consideration when the scheduler selects the most proved to be seminal, and is counted as the key publication suitable datacentre for workload allocation. In the scheduler’s on which Kubernetes is based. A number of features and design, two datacentres with similarly-carbon intense grid concepts from Borg have been brought forward to Kubernetes, electricity are further ranked by temperature, with the cooler including API Servers, Pods, IP-per-Pod, Services, Labels location prioritised for the (re)allocation of the specified [34]. The Omega paper also provides a useful description workload. of scheduler interference [45], whereby multiple schedulers may attempt to claim the same resource simultaneously. The E. Carbon emission model Omega paper explains that two approaches can be used to In this subsection we describe a brief conceptual model of mitigate this: a pessimistic approach which ensures that a the carbon emissions associated to computation and migration particular resource is only made available to one scheduler of work. at a time, and an optimistic approach, which detects conflicts Carbon emissions that result from the consumption of and undoes one or more of the conflicting claims [45]. Our electric energy can be calculated as the product of the electric design, as it operates at a higher level of abstraction, assures energy E and the carbon intensity of electricity I, thus E · I. that Kubernetes continues to deal with bin-packing at the node Compute work drives the consumption of electric energy level, while the scheduler performs global-level scheduling EC (energy compute in data centres and networks mainly between datacentres. in three ways. Most importantly, electric energy is required The default scheduling algorithm used by Kubernetes is for servers during the runtime t of the computation. 
The succinctly explained in a README file in the source code: power consumption P is a function of the varying utilisation There are two steps before a destination node of a of compute resources R (e.g. CPU, memory, IO) over time. Pod is chosen. The first step is filtering all the nodes Thus, EC = P (u(t))dt. Secondly, electric energy is also and the second is ranking the remaining nodes to consumed during transmission over the network of any data find a best fit for the Pod. [46] (input to or results of computation), labeled EN (energy The scheduler evaluates all the nodes in the cluster based network). This energy consumption is proportional to the on a number of rules, known as Predicates; these are schedul- volume of data transferred [49]. Finally, there is a ramp-up ing rules that filter out unqualified nodes7 . Another set of overhead from deploying the Kubernetes service in the target location ER. 6 Kubernetes is in fact claimed to be in many ways superior to Borg and Omega [44] 8 Among the most commonly used are ImageLocalityPriority, Balance- 7 PodFitsResources, PodFitsPorts, MatchNodeSelector etc. dResourceAllocation, LeastRequestedPriority Carbon emissions can be reduced by migrating a Kubernetes the APIs of the Azure platform. The introduction of additional deployment if public clouds is straight forward as described in IV-B. ECA IA > ECB IB + ERB + ENAB IAB 1 The scheduler receives the carbon intensity values for all with ECA , ECB the compute energy in data centre A and B, viable datacentre regions. Once the results have been received, IA , IB the carbon intensities in regions of data centre A and the scheduler ranks the locations. By default this ranking B, ERB the energy consumed for deploying the Kubernetes occurs in accordance carbon intensity and air temperature, but service, ENAB the energy consumed for transporting all can be modified, as demonstrated later. 
required data from A to B and IAB the average carbon Having determined the most suitable (i.e. ‘greenest’) dat- intensity in the network route from A to B [50]. acentre location, the program sends a request to the cloud In this model we assume PUE is similar between data centre Kubernetes or IaaS management API to provision a Resource A and B. Among these variables, the carbon intensities are Group at that datacentre, then verifies that this was successful. known to a Cloud customer via the carbon intensity APIs. Upon confirming the success of Resource Group creation a Cloud providers, i.e. data centre operators will also be able request to provision a Kubernetes cluster is sent. Typically, to determine any differences between ECA and ECB , ERB . a new cluster takes around 10 minutes to provision and for Cloud operators would also know if PUE (power utilisation the credentials to be agreed upon9 , and often an additional efficiency) differs between two locations. PUE factors can minute or two for all of Kubernetes’s internal components to simply be added as coefficients to either side of the relation. be in a ‘Ready’ status. In order to wait for this to happen, In our evaluation we present a concrete service for which the scheduler polls the cluster at regular intervals10 for the very little data has to be transported during migration of the status of its components. Once the cluster is in ‘Ready’ state Kubernet deployment, and that thus can be optimised from the specified Deployment is executed. After all resources have the perspective of the Cloud customer, in other words, based been created the Scheduler deletes the Resource Group in the on knowledge of IA and IB alone, assuming that the runtime region that was just determined to be less suitable11 . This energy consumption for identical computation in two locations design ensures that the next cluster is fully up and deployed A and B is similar, thus ECA ≈ ECB . 
before pulling down the previous cluster, ensuring that the deployment is running continuously. It also addresses the issue of what would happen if such a scheduler were widely used, and if a large number of users were demanding resources from the same datacentre: if the datacentre were overburdened with requests, it would simply return a message to indicate that the deployment cannot be placed, allowing the workload to continue as normal in the previous region.

⁹ Using the host’s public SSH key, or a user-specified public SSH key
¹⁰ In our implementation, every 20 seconds
¹¹ Deleting the cluster alone won’t necessarily remove the cluster’s child resources. Ideally, in all areas that cannot generate renewable energy, CPU cycles of any kind (including those made by the Kubernetes master) would be zero. This would be unnecessarily costly and energy-inefficient.

F. Pseudo-code

Algorithm 1 The Low Carbon Kubernetes Scheduler
Require: kubectl
Require: cloudproviderCLI
  P = (x, y)
  ID
  greenestregion =
  for all P do
    get carbon intensity
    for all P do
      if ID = 0 then
        delete
      end if
      sort by carbon intensity
      if I[loc0] ≈ I[loc0] then
        for all P do
          sortbyairtemp
        end for
        return topregion
      else
        return topregion
      end if
    end for
  end for
  wait30mins

IV. IMPLEMENTATION

In this section we describe the current implementation of the design. At present the implementation is compatible with

AKS (Azure Container Service) [sic] - Managed Kubernetes. AKS reduces the complexity and operational overhead of managing a Kubernetes cluster; however, it is currently offered in only three regions¹². It is also unable to scale down to zero pods¹³.

¹² As of March 2018: centralus, eastus, westeurope [51]
¹³ CLI says it must be 1 or greater [52]

A. Incentivisation

In the first instance, the Low Carbon Scheduler will appeal to organisations aiming to increase the sustainability of their compute jobs. It could also play a role in ensuring that green SLAs are adhered to, by allowing some companies to opt in to a greener scheduling policy. One proposal for the Low Carbon Kubernetes Scheduler is to allow the deployer of cloud resources to declare their deployment as ‘latency-insensitive’, which would permit cloud operators to schedule that workload in a manner that optimises demand-side decarbonisation¹⁴.

B. Extensibility

The software has been written to allow for easy extensibility, so that further metrics can be introduced to influence the datacentre scheduling decisions. The software’s plugin package contains variables and suggested function declarations that would allow practically any kind of metric to be passed to it, similar to the way that the Kubernetes scheduler does. It would be possible, for example, to introduce consideration of live cloud-region pricing data posted on AzurePrice.net. Some metrics are simply not available to the public¹⁵, but would be useful for the implementation of a carbon-aware scheduling policy [53].

In order to facilitate extension to cloud providers beyond Azure, the source code strives to ensure that vendor-specific commands are kept to their own packages. Azure-specific commands are contained in the azacs¹⁶ package. Other developers may then easily add functionality to the scheduler by introducing new packages for each cloud vendor. Additionally, once AKS¹⁷ is supported in a greater number of regions, it would be a trivial task to customise the source code to use AKS instead of (or in addition to) ACS.

It would have been possible to configure the scheduler to pull the deployment specification YAML from the running cluster and pass this configuration on to the next region. Instead, the file is stored in a GitHub ‘gist’ that the scheduler is aware of, which makes use of the practice in cloud computing known as ‘infrastructure as code’. This relates to managing and provisioning resources using definition files, rather than physical hardware configuration. It makes builds more reproducible and allows version control systems to be used to track files and revert a non-functioning declaration to a previous, working state. The scheduler can easily be configured to use either a URL pointing to a raw YAML or JSON file, or a locally-stored deployment configuration.

V. EVALUATION

A. Carbon Ranks

We recorded the carbon intensities for the countries in which the major cloud providers operate data centres (see II-D) between 18.2.2019 13:00 UTC and 21.4.2019 9:00 UTC. We then ranked all countries by the carbon intensity of their electricity in 30 minute intervals. Among the total set of 30 minute values, Switzerland had the lowest carbon intensity (ranked first) in 0.57% of the 30 minute intervals, Norway in 0.31%, France in 0.11% and Sweden in 0.01%.

B. The Heliotropic Scheduler

The list of least carbon intense countries contains only countries in central Europe. To evaluate the Kubernetes extension and its ability to distribute deployments globally, we have chosen to optimise placement to regions with the greatest degree of solar irradiance; we term this a Heliotropic Scheduler. Solar irradiance varies more widely than carbon intensity across global regions.

This scheduler is termed ‘heliotropic’ in order to differentiate it from a ‘follow-the-sun’ application management policy as mentioned in the documentation to the cloud framework Apache Brooklyn [54] [55] and in academic work [56]. While ‘follow-the-sun’ relates to meeting customer demand around the world by placing staff and resources in proximity to those locations (thereby making them available to clients at a lower latency and at a suitable time of day), a ‘heliotropic’ policy goes to where sunlight, and by extension solar irradiance, is abundant.

1) Live solar irradiance data: As the scheduler reacts to changes in insolation in near real time, a good source of live weather data is crucial for its correct functioning. Following a review of seven live weather APIs [57], Weatherbit.io was chosen as it was the sole provider of all three metrics necessary for the Heliotropic Scheduler: air temperature, windspeed, and live insolation data. This latter, crucial measurement was derived from a metric called DHI, or ‘Diffuse Horizontal Irradiance’. DHI signifies the amount of radiation received on a horizontal surface that does not arrive on a direct path from the sun, but has been scattered by molecules and particles in the atmosphere [58]; it roughly corresponds to Watts generated per square metre¹⁸ [59]. The veracity of the insolation data provided by Weatherbit.io could be verified by comparison with equivalent data from other weather API providers.

C. BOINC

We evaluate our implementation of the Heliotropic Scheduler by running BOINC¹⁹ jobs on Kubernetes. BOINC (Berkeley Open Infrastructure for Network Computing) is a software platform for volunteer computing that allows users to contribute computational capacity from their home PCs (usually when the computer is idle) towards scientific research [60]. Among the most widely supported projects are Einstein@Home, SETI@home and IBM World Community Grid²⁰.

While any number of programs could have been chosen or written to carry out compute workload on the heliotropically-scheduled cluster, BOINC was chosen, along with the IBM Community Grid project, so that the project might contribute to scientific research rather than perform an arbitrary ‘number-crunching’ task of our own design.
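As a rough illustration of the heliotropic placement rule of Section V-B (run the workload in the region with the greatest solar irradiance), a minimal Python sketch follows. It is not taken from the scheduler's source code: the names `choose_region`, `Decision` and `min_gain`, and the DHI readings in the usage line, are hypothetical, and in practice the readings would come from a live weather API rather than a literal dictionary. The `min_gain` parameter is an assumed form of reallocation threshold.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Decision:
    region: str    # region the deployment should run in next
    migrate: bool  # True if that differs from the current region


def choose_region(dhi_by_region: Mapping[str, float],
                  current: Optional[str] = None,
                  min_gain: float = 0.0) -> Decision:
    """Pick the region with the highest DHI reading (W/m^2).

    min_gain is a hypothetical reallocation threshold: the deployment
    only moves if the best region beats the current one by more than
    this amount.
    """
    if not dhi_by_region:
        raise ValueError("no irradiance readings supplied")
    # Iterate over sorted names so that ties resolve alphabetically.
    best = max(sorted(dhi_by_region), key=lambda r: dhi_by_region[r])
    if current is None or current not in dhi_by_region:
        # No usable reading for the current placement: go to the best region.
        return Decision(best, migrate=current != best)
    gain = dhi_by_region[best] - dhi_by_region[current]
    if best != current and gain > min_gain:
        return Decision(best, migrate=True)
    return Decision(current, migrate=False)


# Illustrative readings only; the region names are Azure regions used in the tests.
readings = {"australiaeast": 112.0, "westeurope": 0.0, "eastus": 35.5}
print(choose_region(readings, current="eastus", min_gain=50.0))
# prints Decision(region='australiaeast', migrate=True)
```

Keeping the current region unless the gain exceeds a threshold is one simple way to limit how often a deployment migrates; a workload such as BOINC, which needs neither low latency nor persistent storage, tolerates a low threshold.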
The BOINC client downloads raw data, processes it and then uploads the results back to the project servers before requesting additional work [62]. Choosing BOINC as the cluster workload therefore offers the advantage of there being no strong requirement for either low latency or persistent storage. A paper on volunteer computing (specifically BOINC) in the cloud, by Montes, Añel et al. [63], demonstrates the suitability of BOINC for cloud computing in certain circumstances²¹. This project’s work on BOINC [64], including a Dockerfile and publicly available image, is available on Docker Hub²².

¹⁴ Conceivably at a fractionally lower cost in order to incentivise its usage
¹⁵ Such as each datacentre’s green/brown energy mix and how much energy storage capacity is at each location
¹⁶ Azure ACS (Azure Container Service)
¹⁷ Azure’s managed Kubernetes service
¹⁸ Subsequent investigations into the Weatherbit API revealed that additional solar insolation metrics (DNI and GHI) were provided, but undocumented on the Weatherbit website
¹⁹ Rhymes with ‘oink’
²⁰ As of January 2, 2018, 37 BOINC projects are active [61]
²¹ The paper looks at running the BOINC client on Amazon’s Cloud Services platform (AWS), and contributes to the ClimatePrediction.net project
²² Docker Hub is a centralised resource for public and private container images

D. Results of evaluatory experiments

Figs. 4 to 6 show empirical results of the Heliotropic Scheduler placing workload in Microsoft Azure datacentres across the globe. Each column of graphs shows the varying DHI and the deployment location for the BOINC cluster over time, together with a map of the datacentre locations.

The first two tests show how the scheduler correctly identified the most suitable region based on insolation and allocated work to those regions as desired in the design specifications. Fig. 4 shows that the deployment was raised in australiaeast, in accordance with DHI, and remained there for the duration of the test. Fig. 5 shows that the deployment was raised in westeurope, in accordance with DHI, before scheduling itself heliotropically to eastus, and later centralus.

Fig. 6 demonstrates the scheduler’s extensibility. With a minimal amount of configuration, the scheduler operated on a follow-the-wind model. As wind power continues to be generated at night, a greater number of datacentres are in contention to be the most suitable. For this reason a number of redeployments occur over the test’s time period. Depending on the nature of the work, datacentre migration might include the transfer of a significant amount of data. In this case, reallocation thresholds can limit the number of migrations that can occur over a period of time.

Fig. 4. Test 0 (DHI)
Fig. 5. Test 1 (DHI)
Fig. 6. Test 2 (DHI)

VI. CONCLUSION

We presented the design and implementation of a low carbon scheduling policy for the open-source Kubernetes container orchestrator. The implementation is fully functional and could successfully migrate a Kubernetes deployment between global regions.

For cloud customers, the current optimisation model of the scheduler is robust for workloads that do not require significant data transport as part of the migration, as is the case with the BOINC workloads. Even though many cloud providers are contracting for renewable energy with their energy providers, the electricity these data centres take from the grid is generated with release of a varying amount of greenhouse gas emissions into the atmosphere. Our scheduler can contribute to moving demand from more carbon intense electricity to less carbon intense electricity.

REFERENCES

[1] L. A. Barroso, J. Clidaras, and U. Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second Edition. Morgan and Claypool, 2013. [Online]. Available: http://dx.doi.org/10.2200/S00516ED2V01Y201306CAC024
[2] A. S. G. Andrae and T. Edler, “On global electricity usage of communication technology: Trends to 2030,” Challenges, vol. 6, no. 1, pp. 117–157, 2015. [Online]. Available: http://www.mdpi.com/2078-1547/6/1/117
[3] Urs Hölzle, “We’re set to reach 100 percent renewable energy – and it’s just the beginning,” 2016. [Online]. Available: https://www.blog.google/topics/environment/100-percent-renewable-energy/
[4] Microsoft, “Addressing our carbon footprint,” 2017. [Online]. Available: https://www.microsoft.com/about/csr/environment/carbon/
[5] Oracle, “Move to the Cloud for Energy Efficiency,” 2017. [Online]. Available: https://www.oracle.com/solutions/green/cloud-operations.html
[6] A. S. G. Andrae and T. Edler, “Google environment report: 2017 progress update,” 2017. [Online]. Available: https://static.googleusercontent.com/media/environment.google/en//pdf/google-2017-environmental-report.pdf
[7] J. Antonanzas, N. Osorio, R. Escobar, R. Urraca, F. J. Martinez-de-Pison, and F. Antonanzas-Torres, “Review of photovoltaic power forecasting,” Solar Energy, vol. 136, pp. 78–111, 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0038092X1630250X
[8] World Bank, “Primer on Demand-Side Management: With an Emphasis on Price-Responsive Programs,” World Bank Other Operational Studies, 2005. [Online]. Available: http://documents.worldbank.org/curated/en/986041468154163610/Primer-on-demand-side-management-with-an-emphasis-on-price-responsive-programs
[9] P. Palensky and D. Dietrich, “Demand side management: Demand response, intelligent energy systems, and smart loads,” IEEE Transactions on Industrial Informatics, vol. 7, no. 3, pp. 381–388, 2011.
[10] M. Zakarya and L. Gillam, “Energy efficient computing, clusters, grids and clouds: A taxonomy and survey,” Sustainable Computing: Informatics and Systems, vol. 14, pp. 13–33, 2017. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S2210537917300707
[11] C. Brancucci Martinez-Anido, B. Botor, A. R. Florita, C. Draxl, S. Lu, H. F. Hamann, and B.-M. Hodge, “The value of day-ahead solar power forecasting improvement,” Solar Energy, vol. 129, pp. 192–203, 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0038092X16000736
[12] I. Goiri, R. Beauchea, K. Le, T. D. Nguyen, M. E. Haque, J. Guitart, J. Torres, and R. Bianchini, “Greenslot: Scheduling energy consumption in green datacenters,” in 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Nov 2011, pp. 1–11.
[13] I. Goiri, M. E. Haque, K. Le, R. Beauchea, T. D. Nguyen, J. Guitart, J. Torres, and R. Bianchini, “Matching renewable energy supply and demand in green datacenters,” Ad Hoc Networks, vol. 25, pp. 520–534, 2015, New Research Challenges in Mobile, Opportunistic and Delay-Tolerant Networks; Energy-Aware Data Centers: Architecture, Infrastructure, and Communication. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1570870514002649
[14] C. Li, R. Wang, D. Qian, and T. Li, “Managing server clusters on renewable energy mix,” ACM Trans. Auton. Adapt. Syst., vol. 11, no. 1, pp. 1:1–1:24, Feb. 2016. [Online]. Available: http://doi.acm.org/10.1145/2845085
[15] X. Wang, Z. Du, Y. Chen, and M. Yang, “A green-aware virtual machine migration strategy for sustainable datacenter powered by renewable energy,” Simulation Modelling Practice and Theory, vol. 58, pp. 3–14, 2015, Special Issue on Techniques and Applications for Sustainable Ultrascale Computing Systems. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1569190X15000155
[16] A. Hopper and A. Rice, “Computing for the future of the planet,” Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, vol. 366, no. 1881, pp. 3685–3697, 2008. [Online]. Available: http://rsta.royalsocietypublishing.org/content/366/1881/3685
[17] A. Rahman, X. Liu, and F. Kong, “A survey on geographic load balancing based data center power management in the smart grid environment,” IEEE Communications Surveys and Tutorials, vol. 16, no. 1, pp. 214–233, 2014.
[18] S. K. Peter Xiang Gao, Andrew R. Curtis, Bernard Wong, “It’s Not Easy Being Green,” in SIGCOMM, 2012, pp. 2011–2013.
[19] K. Le, R. Bianchini, T. D. Nguyen, O. Bilgir, and M. Martonosi, “Capping the brown energy consumption of internet services at low cost,” 2010 International Conference on Green Computing, Green Comp 2010, pp. 3–14, 2010.
[20] A. Berl, E. Gelenbe, M. Di Girolamo, G. Giuliani, H. De Meer, M. Q. Dang, and K. Pentikousis, “Energy-efficient cloud computing,” The Computer Journal, vol. 53, no. 7, pp. 1045–1051, 2010. [Online]. Available: http://dx.doi.org/10.1093/comjnl/bxp080
[21] Mark Zuckerberg, “Luleå data center,” September 2016. [Online]. Available: https://www.facebook.com/zuck/posts/10103136694875121
[22] H. Zhang, S. Shao, H. Xu, H. Zou, and C. Tian, “Free cooling of data centers: A review,” Renewable and Sustainable Energy Reviews, vol. 35, pp. 171–182, 2014. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1364032114002445
[23] E. Oró and J. Salom, “Energy model for thermal energy storage system management integration in data centres,” Energy Procedia, vol. 73, pp. 254–262, 2015, 9th International Renewable Energy Storage Conference, IRES 2015. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1876610215014526
[24] M. A. Islam, S. Ren, G. Quan, M. Z. Shakir, and A. V. Vasilakos, “Water-constrained geographic load balancing in data centers,” IEEE Transactions on Cloud Computing, vol. 5, no. 2, pp. 208–220, April 2017.
[25] W. V. Heddeghem, W. Vereecken, D. Colle, M. Pickavet, and P. Demeester, “Distributed computing for carbon footprint reduction by exploiting low-footprint energy availability,” Future Generation Computer Systems, vol. 28, no. 2, pp. 405–414, 2012. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0167739X11000859
[26] X. Deng, D. Wu, J. Shen, and J. He, “Eco-aware online power management and load scheduling for green cloud datacenters,” IEEE Systems Journal, vol. 10, no. 1, pp. 78–87, March 2016.
[27] A. Khosravi, L. L. H. Andrew, and R. Buyya, “Dynamic vm placement method for minimizing energy and carbon cost in geographically distributed cloud data centers,” IEEE Transactions on Sustainable Computing, vol. 2, no. 2, pp. 183–196, April 2017.
[28] Md Sabbir Hasan, Yousri Kouki, Thomas Ledoux, and Jean-Louis Pazat, “Exploiting Renewable Sources: When Green SLA Becomes a Possible Reality in Cloud Computing,” IEEE Transactions on Cloud Computing, vol. 5, no. 2, pp. 249–262, 2017.
[29] Microsoft, “Microsoft Azure Data Centre Locations,” 2019. [Online]. Available: https://azure.microsoft.com/en-gb/global-infrastructure/regions/
[30] Google, “Google Cloud Data Centre Regions,” 2019. [Online]. Available: https://cloud.google.com/about/locations/
[31] Amazon Web Services, “Amazon Web Service Data Centre Locations,” 2019. [Online]. Available: https://aws.amazon.com/about-aws/global-infrastructure/
[32] Oracle, “Oracle Cloud Data Centre Locations,” 2019. [Online]. Available: https://cloud.oracle.com/data-regions
[33] B. Tranberg, O. Corradi, B. Lajoie, T. Gibon, I. Staffell, and G. B. Andresen, “Real-Time Carbon Accounting Method for the European Electricity Markets,” 2018. [Online]. Available: http://arxiv.org/abs/1812.06679
[34] Abhishek Verma, Luis Pedrosa, Madhukar Korupolu et al., “Large-scale Cluster Management at Google with Borg,” 2015. [Online]. Available: http://research.google.com/pubs/archive/44843.pdf
[35] Sarah Novotny, “Happy Second Birthday: A Kubernetes Retrospective,” 2017. [Online]. Available: http://blog.kubernetes.io/2017/07/happy-second-birthday-kubernetes.html
[36] The Kubernetes Authors, “Writing Controllers,” 2017. [Online]. Available: https://github.com/kubernetes/community/blob/8decfe4/contributors/devel/controllers.md
[37] NVIDIA and Open Source Contributors, “Build and run Docker containers leveraging NVIDIA GPUs,” 2017. [Online]. Available: https://github.com/NVIDIA/nvidia-container-runtime
[38] William Buchwalter and Rita Zhang, “Autoscaling Deep Learning Training with Kubernetes,” November 2017. [Online]. Available: https://www.microsoft.com/developerblog/2017/11/21/autoscaling-deep-learning-training-kubernetes/
[39] O. Markstedt, J. Persson, J. Andersson, and O. Spjuth, “Kubernetes as an approach for solving bioinformatic problems,” Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Biologiska sektionen, Institutionen för biologisk grundutbildning, 2017.
[40] Lucas Käldström, “Kubernetes on ARM Project,” 2017. [Online]. Available: https://github.com/luxas/kubernetes-on-arm
[41] Portworx, “2017 Annual Container Adoption Survey: Huge Growth in Containers,” April 2017. [Online]. Available: https://portworx.com/2017-container-adoption-survey/
[42] T. Krazit, “The Cloud in 2017: Amazon Web Services shows no signs of slowing during the year of Kubernetes,” December 2017. [Online]. Available: https://www.geekwire.com/2017/cloud-2017-amazon-web-services-shows-no-signs-slowing-year-kubernetes/
[43] The Kubernetes Authors, “Kubernetes Documentation (Home),” 2017. [Online]. Available: https://kubernetes.io/docs/home/
[44] E. Brewer, “Google systems guru explains why containers are the future of computing,” 2015. [Online]. Available: https://medium.com/s-c-a-l-e/google-systems-guru-explains-why-containers-are-the-future-of-computing-87922af2cf95
[45] M. Schwarzkopf, A. Konwinski, M. Abd-El-Malek, and J. Wilkes, “Omega: flexible, scalable schedulers for large compute clusters,” in SIGOPS European Conference on Computer Systems (EuroSys), Prague, Czech Republic, 2013, pp. 351–364. [Online]. Available: http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf
[46] The Kubernetes Authors, “Scheduler Algorithm in Kubernetes,” 2017. [Online]. Available: https://github.com/kubernetes/community/blob/master/contributors/devel/scheduler_algorithm.md
[47] Joe Beda, “Core Kubernetes: Jazz Improv over Orchestration,” 2017. [Online]. Available: https://blog.heptio.com/core-kubernetes-jazz-improv-over-orchestration-a7903ea92ca
[48] The Kubernetes Authors, “Scheduler extender,” 2017. [Online]. Available: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/scheduler_extender.md
[49] D. Schien and C. Preist, “Approaches to energy intensity of the internet,” IEEE Communications Magazine, vol. 52, no. 11, pp. 130–137, Nov. 2014. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6957153
[50] D. Schien, C. Preist, M. Yearworth, and P. Shabajee, “Impact of Location on the Energy Footprint of Digital Media,” in IEEE International Symposium on Sustainable Systems and Technology (IEEE ISSST 2012). Boston, MA: IEEE, 2012.
[51] Microsoft, “Azure status,” January 2018. [Online]. Available: https://azure.microsoft.com/en-gb/status/
[52] Microsoft Azure and Open Source Contributors, “‘az aks scale’ should be able to scale #of linux and windows nodes #79,” 2017. [Online]. Available: https://github.com/Azure/AKS/issues/79
[53] C. Bussar, M. Moos, R. Alvarez, P. Wolf, T. Thien, H. Chen, Z. Cai, M. Leuthold, D. U. Sauer, and A. Moser, “Optimal allocation and capacity of energy storage systems in a future european power system with 100% renewable energy generation,” Energy Procedia, vol. 46, pp. 40–47, 2014. [Online]. Available: http://www.sciencedirect.com/journal/energy-procedia/vol/46
[54] Apache, “Apache Brooklyn Source Code: followthesun directory,” 2017. [Online]. Available: https://github.com/apache/brooklyn-server/tree/master/policy/src/main/java/org/apache/brooklyn/policy/followthesun
[55] Apache, “The Theory behind Brooklyn,” 2017. [Online]. Available: https://brooklyn.apache.org/learnmore/theory.html
[56] E. Carmel, J. A. Espinosa, and Y. Dubinsky, “‘Follow the Sun’ Workflow in Global Software Development,” Journal of Management Information Systems, vol. 27, no. 1, pp. 17–38, 2010. [Online]. Available: http://www.tandfonline.com/doi/abs/10.2753/MIS0742-1222270102
[57] Todd Motto and Open Source Contributors, “A collective list of public JSON APIs for use in web development,” 2017. [Online]. Available: https://github.com/toddmotto/public-apis#weather
[58] Bill Marion, Carol Riordan and David Renne, “Shining On: A Primer on Solar Radiation Data,” 1992. [Online]. Available: https://www.nrel.gov/docs/legosti/old/4856.pdf
[59] Weatherbit.io, “Current Weather API,” 2017. [Online]. Available: https://www.weatherbit.io/api/weather-current
[60] U. of California, “Boinc: Open-source software for volunteer computing,” 2017. [Online]. Available: http://boinc.berkeley.edu/
[61] U. of California, “Choosing boinc projects,” 2018. [Online]. Available: http://boinc.berkeley.edu/projects.php
[62] U. of California, “How boinc works,” 2017. [Online]. Available: http://boinc.berkeley.edu/wiki/How_BOINC_works
[63] D. Montes, J. A. Añel, T. F. Pena, P. Uhe, and D. C. H. Wallom, “Enabling boinc in infrastructure as a service cloud system,” Geoscientific Model Development, vol. 10, no. 2, pp. 811–826, 2017. [Online]. Available: https://www.geosci-model-dev.net/10/811/2017/
[64] obfuscated, “Docker hub public repository: obfuscated,” 2018. [Online]. Available: https://hub.docker.com/r/obfuscated/