<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">≡ Java mod JVM - On the Performance Characteristics of Scala Programs on the Java Virtual Machine</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Andreas</forename><surname>Sewe</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Software Technology Group</orgName>
								<orgName type="institution">Technische Universität Darmstadt</orgName>
								<address>
									<settlement>Darmstadt</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">≡ Java mod JVM - On the Performance Characteristics of Scala Programs on the Java Virtual Machine</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">8FC6B8229DBF9CDA645D02D0D9201188</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T02:07+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In recent years, the Java Virtual Machine has become an attractive target for a multitude of programming languages, one of which is Scala. But while the Scala compiler emits plain Java bytecode, the performance characteristics of Scala programs are not necessarily similar to those of Java programs. We therefore propose to complement a popular Java benchmark suite with several Scala programs and to subsequently evaluate their performance using VM-independent metrics.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>While originally conceived as a target for the Java programming language only, the Java Virtual Machine (JVM) <ref type="bibr" target="#b0">[1]</ref> has since become a target for hundreds of programming languages, the most prominent ones arguably being Clojure, Groovy, Jython, JRuby, and Scala. The JVM can therefore rightly be considered a Joint Virtual Machine.</p><p>Targeting such a joint virtual machine offers a number of engineering benefits to language implementers: after more than 15 years of research and development, the Java platform is very mature. Moreover, it is portable and widespread, and it offers a staggering number of libraries to choose from. Last but not least, the platform is backed by several high-performance JVMs. Alas, simply targeting the JVM does not always result in performance as good as Java's; existing JVMs are primarily tuned with respect to the performance characteristics of Java programs.</p><p>Of the five languages mentioned above, four share one key characteristic: Clojure, Groovy, Python, and Ruby are all dynamically typed. As this single language feature has been identified as the biggest performance bottleneck, the Java Community Process has put forth a specification request (JSR 292) to "[Support] Dynamically Typed Languages on the Java™ Platform," i.e., to close the semantic gap between dynamically typed source languages and Java bytecode.</p><note place="foot">PPPJ'10 WiP Poster Abstract</note><p>While a semantic gap undoubtedly exists for statically typed source languages like Scala <ref type="bibr" target="#b1">[2]</ref> as well, it is less clear what the bottlenecks are. This work-in-progress therefore aims to shed light on the performance characteristics of Scala programs.
In particular, we will answer the following three questions: Are the performance characteristics of Scala programs, from the JVM's perspective, similar or dissimilar to those of Java programs? If they are dissimilar, what are the assumptions that implementers of a JVM have to reconsider? And are Scala programs sufficiently different to warrant special treatment, as the dynamically typed languages now receive?</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Characterising the Performance of Scala Programs</head><p>Previous investigations into the performance of Scala programs have been mostly restricted to micro-benchmarking. <ref type="foot" target="#foot_0">1</ref> While such benchmarks are undeniably useful to the implementers of the Scala compiler, who have to decide between different code generation strategies for a given language feature, they are less useful to implementers of a Java VM, who have to deliver good performance across a wide range of real-world programs, only some of which are written in Scala. Our research will therefore assume the latter's viewpoint, in turn making the following contributions:</p><p>1. A benchmark suite of Scala programs developed as an extension to the popular DaCapo benchmark suite <ref type="bibr" target="#b2">[3]</ref>.</p><p>2. The definition of VM-independent metrics to characterise the performance of Scala programs.</p><p>3. A VM-independent comparison of the performance characteristics of Scala programs and Java programs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Towards a Scala Benchmark Suite</head><p>The following programs (along with potential input data) have been selected for inclusion in the benchmark suite. As of October 2010, half of the implementations are stable (marked †); Figure <ref type="figure" target="#fig_0">1</ref> relates their size to that of the DaCapo benchmarks.</p><p>kiama † The Kiama library for language processing (compiling and interpreting the Obr and ISWIM languages, respectively).</p><p>lift The Lift web framework (running its example application).</p><p>scalac † The "New" Scala compiler (compiling and optimising the Scalaz library).</p><p>scalap † A Scala classfile disassembler (disassembling a complex classfile).</p><p>scalatest ScalaTest, a testing framework supporting various testing styles, including JUnit and TestNG integrations (running its own test suite).</p><p>specs † Specs, another testing framework, which makes heavy use of embedded domain-specific languages (running its own test suite).</p><p>tmt The Stanford Topic Modeling Toolbox, a natural language processing framework driven by Scala scripts (learning a model using Latent Dirichlet Allocation).</p><p>A few of the above benchmarks incorporate a significant amount of code written not in Scala but in plain Java. This choice is deliberate, as it reflects current practice; candidates either employ Scala facades to Java libraries (scalatest, specs) or run on an infrastructure written entirely in Java (lift). The following table summarises this for a selection of Scala benchmarks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Towards VM-Independent Benchmark Comparisons</head><p>Possible metrics to compare benchmarks in a VM-independent fashion are based on object demographics or on the structure of the static and dynamic call graphs. Metrics based on object demographics have already been used extensively to characterise the DaCapo benchmarks <ref type="bibr" target="#b2">[3]</ref>; thus, we will sketch a few metrics of the latter group below. Two of the most effective optimisations a JVM can perform are adaptive recompilation and method inlining. Just how effective these optimisations are is determined, to a large degree, by the program's weighted dynamic call graph: the larger the weight of a vertex, the more profitable it is to recompile the corresponding method; the larger the weight of an edge, the more profitable it is to inline the corresponding call. Each of these optimisations, however, comes at a cost. Any dynamic metric must thus be related to a static metric which reflects the cost of performing said optimisations. In either case, it is essential for the purpose of our study to distinguish the influence of code written in Scala from that of code written in Java within the same benchmark program.</p><p>One metric of particular interest is the number of tail calls which Scala programs exhibit. While the JVM does not yet support the notion of hard tail calls and thus will not guarantee tail-call optimisation, such optimisations are often assumed to be necessary to fully support functional languages on the JVM. The degree to which tail calls are used in the aforementioned benchmarks determines whether such an optimisation would also be beneficial to existing programs, whether written in Scala or Java. In particular, this metric would shed some light on the Scala compiler's effectiveness in eliminating tail calls (cf. Section 3.1).</p></div>
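<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the notion of a weighted dynamic call graph concrete, the following minimal sketch counts how often each method is invoked (vertex weight) and how often each caller-to-callee edge is taken (edge weight). The class CallGraphSketch, its methods, and the method names in the example trace are hypothetical illustrations and not part of the proposed benchmark suite; a real profiler would also have to relate these weights to a static cost metric, which the sketch omits.</p>
<p><![CDATA[
```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: a weighted dynamic call graph built from (caller, callee) events. */
public class CallGraphSketch {
    // Vertex weights: how often each method was invoked.
    public final Map<String, Long> vertexWeight = new TreeMap<>();
    // Edge weights: how often each caller -> callee edge was taken.
    public final Map<String, Long> edgeWeight = new TreeMap<>();

    /** Records one dynamic call from caller to callee. */
    public void record(String caller, String callee) {
        vertexWeight.merge(callee, 1L, Long::sum);
        edgeWeight.merge(caller + " -> " + callee, 1L, Long::sum);
    }

    /** Returns the key with the largest weight, or null if the map is empty. */
    public static String heaviest(Map<String, Long> weights) {
        String best = null;
        long bestWeight = -1;
        for (Map.Entry<String, Long> e : weights.entrySet()) {
            if (e.getValue() > bestWeight) {
                best = e.getKey();
                bestWeight = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        CallGraphSketch g = new CallGraphSketch();
        // Made-up trace: main calls parse once, parse calls nextToken three times.
        g.record("main", "parse");
        for (int i = 0; i < 3; i++) {
            g.record("parse", "nextToken");
        }
        g.record("main", "emit");
        System.out.println("recompilation candidate: " + heaviest(g.vertexWeight));
        System.out.println("inlining candidate: " + heaviest(g.edgeWeight));
    }
}
```
]]></p>
<p>In this made-up trace the heaviest vertex is nextToken and the heaviest edge is the call from parse to nextToken, so these would be the first candidates for recompilation and inlining, respectively.</p></div>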
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Future Directions</head><p>In the following, we outline a few research directions on which we will embark once the above contributions have been made.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Optimising Compiler vs. Optimising VM</head><p>The semantic gap between Scala source code and Java bytecode is wider than the gap between Java source and bytecode. It is therefore likely that the peculiar nature of the bytecode derived from Scala sources inhibits some of the optimisations a production JVM will perform on Java programs.</p><p>The Scala compiler scalac is thus able to perform several optimisations on its own: method inlining, escape analysis (for closure elimination), and tail call optimisation. All these optimisations have traditionally been the domain of the JVM. Working offline, however, the compiler can spend considerably more time optimising. It does not have access to online profiles, though. The key question is thus whether the semantic gap is wide enough to warrant the re-implementation of optimisations within the compiler or whether the VM is the proper place for these optimisations.</p></div>
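<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of what tail-call optimisation in a compiler amounts to, the sketch below shows a self-recursive method whose recursive call is in tail position, next to the loop into which such a call can be rewritten ahead of time. It is written in Java for brevity; the class TailCallSketch and its methods are hypothetical and do not reproduce scalac's actual transformation.</p>
<p><![CDATA[
```java
/** Hypothetical sketch: a self-recursive tail call and an equivalent loop. */
public class TailCallSketch {
    /**
     * Sums 1..n with an accumulator. The recursive call is the last action
     * of the method, but a JVM without tail-call support may still push
     * one stack frame per call.
     */
    public static long sumRecursive(long n, long acc) {
        if (n == 0) {
            return acc;
        }
        return sumRecursive(n - 1, acc + n); // call in tail position
    }

    /**
     * The same computation after eliminating the tail call: the call is
     * replaced by updating the parameters and jumping back to the loop head.
     */
    public static long sumLoop(long n, long acc) {
        while (n != 0) {
            acc = acc + n;
            n = n - 1;
        }
        return acc;
    }

    public static void main(String[] args) {
        // Kept shallow so that the recursive variant fits a default-sized stack.
        System.out.println(sumRecursive(1_000, 0));
        // The loop variant runs in constant stack space for any n.
        System.out.println(sumLoop(10_000_000L, 0));
    }
}
```
]]></p>
<p>Both methods compute the same result; only the loop variant is independent of the available stack depth, which is why the share of tail calls in real programs matters for deciding where this optimisation belongs.</p></div>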
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">JVM vs. Common Language Runtime</head><p>Scala targets a second platform besides the JVM, namely the Common Language Runtime (CLR). This gives rise to further questions: Do the answers to the above questions carry over to the CLR? If so, what makes such a generalisation possible?</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: The size and complexity of 15 benchmark programs (excluding harness) written in Java and Scala, respectively.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>Method calls by code origin for a selection of Scala benchmarks.</figDesc><table><row><cell>Benchmark</cell><cell cols="3"># Method Calls</cell></row><row><cell></cell><cell>Java JRE</cell><cell>Java (other)</cell><cell>Scala</cell></row><row><cell>scalac †</cell><cell>7.29%</cell><cell>0.22%</cell><cell>92.49%</cell></row><row><cell>scalap †</cell><cell>29.83%</cell><cell>0.04%</cell><cell>70.13%</cell></row><row><cell>specs †</cell><cell>89.99%</cell><cell>0.06%</cell><cell>9.95%</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">The language's implementers themselves perform a number of so-called shoot-outs, each testing a particular language feature: http://www.scala-lang.org/node/360.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>Thanks go to the entire team behind the DaCapo benchmark suite, who have provided us with a rock-solid foundation to work on.</p><p>This work was supported by CASED (www.cased.de).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">The Java Virtual Machine Specification</title>
		<author>
			<persName><forename type="first">Tim</forename><surname>Lindholm</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Frank</forename><surname>Yellin</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1999">1999</date>
			<publisher>Addison-Wesley</publisher>
		</imprint>
	</monogr>
	<note>2nd edition</note>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Programming in Scala</title>
		<author>
			<persName><forename type="first">Martin</forename><surname>Odersky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lex</forename><surname>Spoon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bill</forename><surname>Venners</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2008">2008</date>
			<publisher>Artima Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The DaCapo benchmarks: Java benchmarking development and analysis</title>
		<author>
			<persName><forename type="first">Stephen</forename><forename type="middle">M</forename><surname>Blackburn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Robin</forename><surname>Garner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chris</forename><surname>Hoffmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Asjad</forename><forename type="middle">M</forename><surname>Khan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kathryn</forename><forename type="middle">S</forename><surname>McKinley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Rotem</forename><surname>Bentzur</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Amer</forename><surname>Diwan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daniel</forename><surname>Feinberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daniel</forename><surname>Frampton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Samuel</forename><forename type="middle">Z</forename><surname>Guyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Martin</forename><surname>Hirzel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Antony</forename><surname>Hosking</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Maria</forename><surname>Jump</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Han</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">Eliot</forename><forename type="middle">B</forename><surname>Moss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Aashish</forename><surname>Phansalkar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Darko</forename><surname>Stefanović</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Thomas</forename><surname>VanDrunen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Daniel</forename><surname>von Dincklage</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ben</forename><surname>Wiedermann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 21st Conference on Object-Oriented Programming Systems, Languages, and Applications</title>
				<meeting>the 21st Conference on Object-Oriented Programming Systems, Languages, and Applications<address><addrLine>Portland, Oregon, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="169" to="190" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
