<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Recent Advances in Energy Efficient Query Processing</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Matteo</forename><surname>Catena</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National Research Council of Italy</orgName>
								<address>
									<settlement>Pisa</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Nicola</forename><surname>Tonellotto</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National Research Council of Italy</orgName>
								<address>
									<settlement>Pisa</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Recent Advances in Energy Efficient Query Processing</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">DCB432729B3DB84411BBA70B4EC0CEFE</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T22:05+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Web search companies distribute their infrastructures and operations across several geographically distant data centers. This distributed architecture facilitates high-performance query processing, which is fundamental for the success of a Web search engine. At the same time, data centers require a huge amount of electricity to operate their computing resources. In this extended abstract, we briefly discuss our recent work on improving the energy efficiency of query processing systems. First, we introduce a novel query forwarding algorithm which exploits green energy sources to reduce the electricity expenditure and carbon footprint of Web search engines. Then, we propose to delegate the CPU power management from a server's operating system directly to the query processing application, to reduce the energy consumption of a search engine's servers. Finally, we introduce PESOS, a scheduling algorithm which manages the CPU power consumption on a per-query basis while considering query latency constraints.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>High-performance query processing is fundamental for the success of a Web search engine. In fact, Web search engines can receive billions of queries per day <ref type="bibr" target="#b4">[5]</ref>. Additionally, users are often impatient and expect sub-second response times to their queries (e.g., 500 ms). Indeed, users become less engaged <ref type="bibr" target="#b0">[1]</ref> or migrate to other search services <ref type="bibr" target="#b10">[11]</ref> when a search engine fails to provide fast responses to queries. For these reasons, search companies adopt distributed query processing strategies to cope with huge volumes of incoming queries and to provide sub-second response times.</p><p>Web search engines perform distributed query processing on computer clusters composed of thousands of computers and hosted in large data centers <ref type="bibr" target="#b4">[5]</ref>. While such facilities enable large-scale online services, they also raise economic and environmental concerns. Indeed, a large-scale data center, like those used by Web search engines, can draw tens of megawatts of electricity to operate, and it can cost 9 million US dollars per year in energy expenditure <ref type="bibr" target="#b7">[8]</ref>. Therefore, an important problem to address is how to reduce the energy expenditure of data centers. Additionally, producing and consuming electricity can involve the emission of carbon dioxide, which is the main cause of global warming due to the greenhouse effect. In 2007, the Information and Communication Technology (ICT) sector was reported to be responsible for roughly 2% of global carbon emissions, with general-purpose data centers accounting for 14% of the ICT footprint. These emission levels were projected to more than double by 2020 <ref type="bibr" target="#b12">[13]</ref>. 
Therefore, another problem to tackle is how to reduce these emissions and the negative impact of data centers on the environment.</p><p>An obvious solution to these challenges is to design more energy-efficient data centers, which consume less energy and, consequently, pollute and cost less. In the past, a large part of the energy consumption of a data center could be attributed to inefficiencies in its cooling and power supply systems. However, search companies already adopt state-of-the-art techniques to reduce the energy wastage of such systems<ref type="foot" target="#foot_0">1</ref>,<ref type="foot" target="#foot_1">2</ref>, leaving little room for further improvement in those areas. Indeed, the energy consumption of a state-of-the-art data center would be reduced by less than 24% if all the overheads in its cooling and power supply systems were eliminated <ref type="bibr" target="#b1">[2]</ref>. Therefore, new approaches are necessary to mitigate the environmental impact and the energy expenditure of Web search engines.</p><p>One option is to use green energy. In fact, several search companies partially power their data centers with green energy, i.e., energy which comes from resources that are renewable and do not emit carbon dioxide, such as sunlight and wind<ref type="foot" target="#foot_2">3</ref>,<ref type="foot" target="#foot_3">4</ref>,<ref type="foot" target="#foot_4">5</ref>. At the same time, Web search engines experience spatial and temporal variations in electricity prices <ref type="bibr" target="#b9">[10]</ref> as they distribute their infrastructures and operations across several geographically distant data centers <ref type="bibr" target="#b4">[5]</ref>. Stemming from these observations, we propose a novel query forwarding algorithm that exploits both the green energy sources available at different data centers and the differences in market energy prices <ref type="bibr" target="#b2">[3]</ref>. 
The main idea is to forward queries from the data center that first received them to a different one, if the latter can rely on green or cheaper energy sources. The problem of exploiting different energy sources to reduce costs when forwarding queries is modeled as a Minimum Cost Flow Problem. The model takes into account the different and limited processing capacities of data centers, query response time constraints and communication latencies among sites. We evaluate the proposed algorithm using workloads obtained from the Yahoo search engine, together with realistic electricity price data. Our experimental results show that the proposed solution maintains a high query throughput while reducing the energy operational costs of multi-center search engines by up to 25%. Moreover, our algorithm can reduce the consumption of non-green energy by almost 6%.</p><p>The energy expenditure and carbon footprint of a search company can also be mitigated by reducing the energy consumption of its computing resources. In particular, reducing the energy consumption of CPUs represents an attractive avenue for Web search engines. In fact, CPUs are the most energy-consuming component in servers dedicated to query processing, accounting for 40% of total energy consumption when a server is idle and for 66% of total energy consumption when it is fully utilized <ref type="bibr" target="#b1">[2]</ref>. Dynamic Voltage and Frequency Scaling (DVFS) technologies can be used to reduce the CPU energy consumption of a server <ref type="bibr" target="#b11">[12]</ref>. DVFS makes it possible to adjust the frequency and voltage at which the CPU cores operate, trading off performance for power consumption. In fact, higher core frequencies mean faster computations but higher power consumption, while lower frequencies lead to slower computations but reduced power consumption. 
However, care is required when reducing the operating frequency of the CPU cores, since low frequencies entail longer query processing times that may be unacceptable to users.</p><p>Typically, DVFS mechanisms are managed by operating system (OS) components called frequency governors <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b13">14]</ref>. However, the OS lacks domain-specific information regarding the utilization and load of the query processing application. This knowledge can be exploited to better throttle the frequency of the CPU cores, thereby reducing the power consumption of a query processing server. Therefore, in <ref type="bibr" target="#b5">[6]</ref> we propose to delegate the CPU power management from the OS frequency governors to the query processing application, and we devise search engine-specific frequency governors. We experimentally evaluate such governors on the TREC ClueWeb09B corpus and the query stream from the MSN 2006 query log. Results show that knowledge of the query processing server's utilization and load facilitates a more refined control of the CPU to achieve power savings. In fact, the proposed search engine-specific governors can reduce a server's power consumption by up to 24%, with only limited (but uncontrollable) drawbacks in the quality of search results with respect to a system operating at maximum CPU frequency.</p><p>Another important aspect that can be exploited to reduce the energy consumption of a server is the fact that users can hardly notice response times that are faster than their expectations <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b10">11]</ref>. Therefore, we argue that Web search engines should not process queries faster than users expect and, consequently, we propose the Predictive Energy Saving Online Scheduling (PESOS) algorithm <ref type="bibr" target="#b6">[7]</ref>. 
PESOS selects the most appropriate CPU frequency to process a query by its deadline, on a per-core basis. It considers the latency requirement of queries as an explicit parameter, and it tries to process queries no faster than required. In doing so, the CPU energy consumption is reduced while respecting the query latency constraints. PESOS bases its decisions on query efficiency predictors, which are techniques to estimate the processing volume and processing time of a query before its execution <ref type="bibr" target="#b8">[9]</ref>. We experimentally evaluate PESOS on the TREC ClueWeb09B collection and the MSN 2006 query log. Results show that, depending on the required latency, PESOS can reduce the CPU energy consumption of a query processing server by 24% to 48% compared to a high-performance system running at maximum CPU core frequency. Also, PESOS outperforms our best search engine-specific frequency governor <ref type="bibr" target="#b5">[6]</ref> with a 20% energy saving, while the competitor requires fine parameter tuning and may incur uncontrollable latency violations.</p></div>			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://www.google.com/about/datacenters/efficiency/internal/</note>
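			<div xmlns="http://www.tei-c.org/ns/1.0"><p>The query forwarding model described above, in which forwarding is cast as a Minimum Cost Flow Problem, can be illustrated with a small sketch. This is not the formulation of <ref type="bibr" target="#b2">[3]</ref>: it is a simplified snapshot under stated assumptions. The names <code>MinCostFlow</code> and <code>plan_forwarding</code> are hypothetical, energy prices are integer costs per query, the response time and communication latency constraints are reduced to a boolean <code>reachable</code> matrix, and the flow is computed with a textbook successive-shortest-path method.</p>
<p><code lang="python"><![CDATA[
```python
from collections import deque


class MinCostFlow:
    """Successive shortest paths on a residual graph (integer costs)."""

    def __init__(self, n):
        self.n = n
        # Each edge is [target, residual capacity, cost, index of reverse edge].
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t, max_flow):
        flow = cost = 0
        while flow < max_flow:
            # Queue-based Bellman-Ford: cheapest residual path from s.
            dist = [float("inf")] * self.n
            prev = [None] * self.n
            queued = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                queued[u] = False
                for i, (v, cap, c, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, i)
                        if not queued[v]:
                            queued[v] = True
                            queue.append(v)
            if prev[t] is None:
                break  # no remaining capacity towards the sink
            # Find the bottleneck along the path, then push flow through it.
            push = max_flow - flow
            v = t
            while v != s:
                u, i = prev[v]
                push = min(push, self.graph[u][i][1])
                v = u
            v = t
            while v != s:
                u, i = prev[v]
                self.graph[u][i][1] -= push
                self.graph[v][self.graph[u][i][3]][1] += push
                v = u
            flow += push
            cost += push * dist[t]
        return flow, cost


def plan_forwarding(arrivals, capacity, price, reachable):
    """Assign queries arriving at each data center to processing sites.

    arrivals[i]: queries received by data center i
    capacity[j]: queries that site j can process
    price[j]: integer energy cost per query at site j (assumed unit)
    reachable[i][j]: True if forwarding i -> j respects the latency budget
    """
    n = len(arrivals)
    source, sink = 2 * n, 2 * n + 1
    net = MinCostFlow(2 * n + 2)
    edge_index = {}
    for i in range(n):
        net.add_edge(source, i, arrivals[i], 0)
        net.add_edge(n + i, sink, capacity[i], 0)
        for j in range(n):
            if i == j or reachable[i][j]:  # local processing is always allowed
                edge_index[(i, j)] = len(net.graph[i])
                net.add_edge(i, n + j, arrivals[i], price[j])
    flow, cost = net.solve(source, sink, sum(arrivals))
    # Flow on an edge = original capacity minus residual capacity.
    plan = {(i, j): arrivals[i] - net.graph[i][k][1]
            for (i, j), k in edge_index.items()}
    return flow, cost, plan


if __name__ == "__main__":
    # Illustrative numbers only: site 1 is cheaper (e.g., green) but smaller.
    print(plan_forwarding([100, 50], [120, 80], [5, 2],
                          [[True, True], [True, True]]))
```
]]></code></p>
<p>In this sketch, the cheaper site is filled up to its capacity before the more expensive one is used, which mirrors the idea of forwarding queries towards green or cheaper energy. Time-varying prices, green energy availability and latency penalties would need a richer model than the one assumed here.</p></div>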
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://www.microsoft.com/about/csr/downloadhandler.ashx?Id=02-01-12</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://environment.google/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://www.microsoft.com/about/csr/environment/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://sustainability.fb.com</note>
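			<div xmlns="http://www.tei-c.org/ns/1.0"><p>The per-query frequency selection described for PESOS can be illustrated with a minimal sketch. It is not the PESOS algorithm of <ref type="bibr" target="#b6">[7]</ref>: the function <code>pick_frequency</code> is hypothetical, it considers a single query on a single core, and it assumes that processing time equals the predicted number of CPU cycles divided by the core frequency. The frequency list and the 500 ms budget in the example are illustrative values.</p>
<p><code lang="python"><![CDATA[
```python
from typing import Sequence


def pick_frequency(predicted_cycles: float,
                   time_budget_s: float,
                   frequencies_hz: Sequence[float]) -> float:
    """Return the lowest frequency that completes the query within its budget.

    Assumption: processing time = predicted_cycles / frequency. If no
    frequency meets the deadline, fall back to the highest one.
    """
    if not frequencies_hz:
        raise ValueError("at least one frequency is required")
    for freq in sorted(frequencies_hz):
        if time_budget_s > 0 and predicted_cycles / freq <= time_budget_s:
            return freq
    return max(frequencies_hz)


if __name__ == "__main__":
    available = [0.8e9, 1.6e9, 2.4e9, 3.2e9]  # hypothetical core frequencies
    # A query predicted to need 6e8 cycles, with a 500 ms latency budget.
    print(pick_frequency(6e8, 0.5, available))
```
]]></code></p>
<p>Choosing the lowest sufficient frequency, rather than the highest available one, reflects the idea of processing queries no faster than required. A scheduler following the description above would also have to account for the queries already queued on the same core and for prediction errors, which this sketch ignores.</p></div>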
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Impact of Response Latency on User Behavior in Web Search</title>
		<author>
			<persName><forename type="first">I</forename><surname>Arapakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">B</forename><surname>Cambazoglu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. SIGIR</title>
				<meeting>SIGIR<address><addrLine>Gold Coast, Queensland, Australia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="103" to="112" />
		</imprint>
	</monogr>
	<note>ACM</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The Datacenter as a Computer: an Introduction to the Design of Warehouse-scale Machines</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Barroso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Clidaras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Hölzle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Synthesis lectures on computer architecture</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="1" to="154" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Exploiting Green Energy to Reduce the Operational Costs of Multi-Center Web Search Engines</title>
		<author>
			<persName><forename type="first">R</forename><surname>Blanco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Catena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tonellotto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. WWW</title>
				<editor>
			<persName><surname>IW3C2</surname></persName>
		</editor>
		<meeting>WWW<address><addrLine>Montreal, Quebec, Canada</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="1237" to="1247" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">CPU frequency and voltage scaling code in the Linux kernel</title>
		<author>
			<persName><forename type="first">D</forename><surname>Brodowski</surname></persName>
		</author>
		<ptr target="https://www.kernel.org/doc/Documentation/cpu-freq/index.txt" />
		<imprint>
			<date type="published" when="2015">2015. 2016-11-08</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Scalability Challenges in Web Search Engines</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">B</forename><surname>Cambazoglu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Baeza-Yates</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Synthesis Lectures on Information Concepts, Retrieval, and Services</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="1" to="138" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Load-sensitive CPU Power Management for Web Search Engines</title>
		<author>
			<persName><forename type="first">M</forename><surname>Catena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Macdonald</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tonellotto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. SIGIR</title>
				<meeting>SIGIR<address><addrLine>Santiago, Chile</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="751" to="754" />
		</imprint>
	</monogr>
	<note>ACM</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Energy-efficient Query Processing in Web Search Engines</title>
		<author>
			<persName><forename type="first">M</forename><surname>Catena</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tonellotto</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The Cost of a Cloud: Research Problems in Data Center Networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Greenberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hamilton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Maltz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Patel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SIGCOMM Computer Communication Review</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="68" to="73" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Learning to Predict Response Times for Online Query Scheduling</title>
		<author>
			<persName><forename type="first">C</forename><surname>Macdonald</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tonellotto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Ounis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. SIGIR</title>
				<meeting>SIGIR<address><addrLine>Portland, Oregon, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="621" to="630" />
		</imprint>
	</monogr>
	<note>ACM</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Cutting the Electric Bill for Internet-scale Systems</title>
		<author>
			<persName><forename type="first">A</forename><surname>Qureshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Weber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Balakrishnan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Guttag</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Maggs</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. SIGCOMM</title>
				<meeting>SIGCOMM<address><addrLine>Barcelona, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="123" to="134" />
		</imprint>
	</monogr>
	<note>ACM</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Performance Related Changes and their User Impact</title>
		<author>
			<persName><forename type="first">E</forename><surname>Schurman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Brutlag</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. Velocity</title>
				<editor>
			<persName><surname>O'Reilly</surname></persName>
		</editor>
		<meeting>Velocity<address><addrLine>San Jose, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Power Management and Dynamic Voltage Scaling: Myths and Facts</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">C</forename><surname>Snowdon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ruocco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Heiser</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. PARC workshop at EMSoft. IEEE</title>
				<meeting>PARC workshop at EMSoft. IEEE</meeting>
		<imprint>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<ptr target="http://gesi.org/files/Reports/Smart%202020%20report%20in%20English.pdf" />
		<title level="m">The Climate Group for the Global e-Sustainability Initiative: Smart 2020: Enabling the low carbon economy in the information age</title>
		<imprint>
			<date type="published" when="2008">2008. 2016-11-04</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<ptr target="https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt" />
		<title level="m">The Linux Kernel Archives: Intel P-State driver</title>
				<imprint>
			<date type="published" when="2016-11-08">2016. 2016-11-08</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
