<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Birgit</forename><surname>Hofer</surname></persName>
							<email>bhofer@ist.tugraz.at</email>
						</author>
						<author>
							<persName><forename type="first">Dietmar</forename><surname>Jannach</surname></persName>
							<email>dietmar.jannach@udo.edu</email>
						</author>
						<author>
							<persName><forename type="first">Thomas</forename><surname>Schmitz</surname></persName>
							<email>thomas.schmitz@udo.edu</email>
						</author>
						<author>
							<persName><forename type="first">Franz</forename><surname>Wotawa</surname></persName>
							<email>wotawa@ist.tugraz.at</email>
						</author>
						<author>
							<affiliation key="aff0">
								<orgName type="institution">Graz University of Technology</orgName>
								<address>
									<postCode>8010</postCode>
									<settlement>Graz</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="institution">TU Dortmund</orgName>
								<address>
									<postCode>44221</postCode>
									<settlement>Dortmund</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff2">
								<orgName type="institution">TU Dortmund</orgName>
								<address>
									<postCode>44221</postCode>
									<settlement>Dortmund</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kostyantyn</forename><surname>Shchekotykhin</surname></persName>
							<affiliation key="aff3">
								<orgName type="institution">University Klagenfurt</orgName>
								<address>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff4">
								<orgName type="institution">Graz University of Technology</orgName>
								<address>
									<postCode>8010</postCode>
									<settlement>Graz</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">59A35941B7636C728F87B431DC7C3651</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T17:28+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>H.4.1 [Information Systems Applications]: Office Automation - Spreadsheets</term>
					<term>D.2.5 [Software Engineering]: Testing and Debugging - Debugging aids</term>
					<term>Spreadsheets</term>
					<term>Debugging</term>
					<term>Fault Localization</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In recent years, researchers have developed a number of techniques to assist the user in locating a fault within a spreadsheet. The evaluation of these approaches is often based on spreadsheets into which artificial errors are injected. In this position paper, we summarize different shortcomings of these forms of evaluations and sketch possible remedies including the development of a publicly available spreadsheet corpus for benchmarking as well as user and field studies to assess the true value of the proposed techniques.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>Locating the reasons why a given spreadsheet program does not compute the expected outcomes can be a tedious task. In recent years, researchers have developed a number of methods that support the user in the fault localization and correction (debugging) process. The techniques range from the visualization of suspicious cells or regions of the spreadsheet, and the application of known practices from software engineering like spectrum-based fault localization (SFL) or slicing, to declarative and constraint-based reasoning techniques <ref type="bibr" target="#b1">[1,</ref><ref type="bibr" target="#b3">3,</ref><ref type="bibr" target="#b6">6,</ref><ref type="bibr" target="#b7">7,</ref><ref type="bibr" target="#b9">9,</ref><ref type="bibr" target="#b11">11,</ref><ref type="bibr" target="#b12">12,</ref><ref type="bibr" target="#b16">16]</ref>.</p><p>However, a number of challenges are common to all these approaches. Unlike in other computer science sub-areas, such as natural language processing, information retrieval or automated planning and scheduling, no standard benchmarks exist for spreadsheet debugging methods. The absence of commonly used benchmarks prevents the direct comparison of spreadsheet debugging approaches. Furthermore, fault localization and debugging for spreadsheets require the design of a user-debugger interface. An important question in this context is: what input or interaction can realistically be expected from the user? Finally, the main question to be answered is whether or not automated debugging techniques actually help the developer, as discussed in <ref type="bibr" target="#b14">[14]</ref> for imperative programs.</p><p>In this position paper, we discuss some limitations of the current research practice in the field and outline potential ways to improve it in the future.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">LACK OF BENCHMARK PROBLEMS</head><p>To demonstrate the usefulness of a new debugging technique, we need spreadsheets containing faults. Since no public set of such spreadsheets exists, researchers often create their own suite of benchmark problems, e.g., by applying mutation operators to existing correct spreadsheets <ref type="bibr" target="#b2">[2]</ref>. Unfortunately, these problems are only rarely made publicly available. This makes a comparative evaluation of approaches difficult, and it often remains unclear whether the proposed technique is applicable to a wider class of spreadsheets.</p><p>In some papers, spreadsheets from the EUSES corpus<ref type="foot" target="#foot_0">1</ref> are used for evaluations. As no information exists about the intended semantics of these spreadsheets, mutations are applied in order to obtain faulty versions of the spreadsheets. The spreadsheets in this corpus are, however, quite diverse, e.g., with respect to their size or the types of formulas used. Often only a subset of the documents is used in the evaluations, and the selection of the subset is not well justified. Even when the benchmark problems are publicly shared, like the ones used in <ref type="bibr" target="#b10">[10]</ref>, they may have special characteristics that are advantageous for a certain method, e.g., they may contain only a single fault or use only certain functions or cell data types.</p><p>A corpus of diverse benchmark problems is strongly needed for spreadsheet debugging to make different research approaches more comparable and to be able to identify shortcomings of existing approaches. Such a corpus could be built incrementally by researchers sharing their real-world and artificial benchmark problems.
In addition, since it is not always clear if typical spreadsheet mutation operators truly correspond to mistakes developers make, insights and practices from the Information Systems field should be better integrated into our research. This in particular includes the use of spreadsheet construction exercises in laboratory settings that help us identify which kinds of mistakes users make and what their debugging strategies are, see, e.g., <ref type="bibr" target="#b4">[4]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">USABILITY AND USER ACCEPTANCE</head><p>Spreadsheet debugging research is often based on offline experimental designs, e.g., by measuring how many of the injected faults are successfully located with a given technique, see, e.g., <ref type="bibr" target="#b5">[5]</ref>. In some cases, plug-ins to spreadsheet environments are developed, as in <ref type="bibr" target="#b1">[1]</ref> or <ref type="bibr" target="#b11">[11]</ref>. Similar to plug-ins used for other purposes, e.g., spreadsheet testing, the usability of these plug-ins for end users is seldom the focus of the research. The proposed plug-ins typically require various types of input from the user at different stages of the debugging process. Some of these inputs have to be provided at the beginning of the process and some can be requested by the debugger during fault localization. Typical inputs of a debugger include statements about the correctness of values/formulas in individual cells <ref type="bibr" target="#b10">[10]</ref>, information about expected values for certain cells <ref type="bibr" target="#b1">[1,</ref><ref type="bibr" target="#b3">3]</ref>, specification of multiple test cases <ref type="bibr" target="#b11">[11]</ref>, etc.</p><p>In many cases, it remains unclear whether an average spreadsheet developer will be willing or able to provide these inputs, since concepts like test cases do not exist in the spreadsheet paradigm. Therefore, researchers have to ensure that a developer interprets the requests from the debugger correctly and provides appropriate inputs as expected by the debugger.
One additional problem in that context is that user inputs, e.g., the test case specifications, are usually considered to be reliable, and most existing approaches have no built-in means to deal with errors in the inputs.</p><p>Overall, we argue that offline experimental evaluations should be paired with user studies whenever possible, as done, e.g., in <ref type="bibr" target="#b8">[8]</ref> or <ref type="bibr" target="#b11">[11]</ref>. Such studies should help us validate whether our approaches are based on realistic assumptions and are acceptable at least for ambitious users after some training. At the same time, observations of the users' behavior during debugging can be used to learn about their problem-solving strategies and to evaluate whether the tool actually helped to find a fault.</p><p>Again, insights and practices from the fields of both Information Systems and Human-Computer Interaction should be the basis for these forms of experiments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">FIELD RESEARCH</head><p>In addition to user studies in laboratory environments, research on real spreadsheets as suggested in <ref type="bibr" target="#b15">[15]</ref> is required to determine potential differences between the experimental usage of the proposed debugging methods and the everyday use of such tools in companies or institutes. Error rates and types found in practice could differ from what is observed in user studies, whose participants are in many cases students. For example, in <ref type="bibr" target="#b13">[13]</ref>, a construction exercise with business managers was conducted to determine error rates. In addition, the user acceptance of fault localization tools could vary strongly because professional users have different expectations of the tools they use. To ensure usability for real users, existing spreadsheets can be examined and users can be surveyed with questionnaires, as done, e.g., in <ref type="bibr" target="#b7">[7]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">CONCLUSIONS</head><p>A number of proposals have been made in the recent literature to assist the user in the process of locating faults in a given spreadsheet. In this position paper, we have identified some limitations of current research practice regarding the comparability and reproducibility of the results. As possible remedies to these shortcomings, we advocate the development of a corpus of benchmark problems and the increased adoption of user studies of various types as an evaluation instrument. As experimental settings differ from real life, we additionally propose to use field studies to obtain insights on how debugging methods are used in companies.</p></div>			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://esquared.unl.edu/wikka.php?wakka=EUSESSpreadsheetCorpus</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">GoalDebug: A Spreadsheet Debugger for End Users</title>
		<author>
			<persName><forename type="first">R</forename><surname>Abraham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Erwig</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. ICSE 2007</title>
				<meeting>ICSE 2007</meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="251" to="260" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Mutation Operators for Spreadsheets</title>
		<author>
			<persName><forename type="first">R</forename><surname>Abraham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Erwig</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Trans. on Softw. Eng</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="94" to="108" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Constraint-based debugging of spreadsheets</title>
		<author>
			<persName><forename type="first">R</forename><surname>Abreu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Riboira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wotawa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. CibSE&apos;12</title>
				<meeting>CibSE&apos;12</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="1" to="14" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">An Experimental Study of People Creating Spreadsheets</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Gould</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM TOIS</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="258" to="272" />
			<date type="published" when="1987">1987</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Automatic Detection of Dimension Errors in Spreadsheets</title>
		<author>
			<persName><forename type="first">C</forename><surname>Chambers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Erwig</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Vis. Lang. &amp; Comp</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="269" to="283" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Towards a catalog of spreadsheet smells</title>
		<author>
			<persName><forename type="first">J</forename><surname>Cunha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A P</forename><surname>Fernandes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ribeiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Saraiva</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. ICCSA&apos;12</title>
				<meeting>ICCSA&apos;12</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="202" to="216" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Supporting Professional Spreadsheet Users by Generating Leveled Dataflow Diagrams</title>
		<author>
			<persName><forename type="first">F</forename><surname>Hermans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pinzger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Van Deursen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. ICSE 2011</title>
				<meeting>ICSE 2011</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="451" to="460" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Detecting and Visualizing Inter-Worksheet Smells in Spreadsheets</title>
		<author>
			<persName><forename type="first">F</forename><surname>Hermans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pinzger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Van Deursen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICSE 2012</title>
				<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="441" to="451" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Detecting Code Smells in Spreadsheet Formulas</title>
		<author>
			<persName><forename type="first">F</forename><surname>Hermans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pinzger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Van Deursen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. ICSM 2012</title>
				<meeting>ICSM 2012</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="409" to="418" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">On the Empirical Evaluation of Fault Localization Techniques for Spreadsheets</title>
		<author>
			<persName><forename type="first">B</forename><surname>Hofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Riboira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wotawa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Abreu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Getzner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. FASE 2013</title>
				<meeting>FASE 2013</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="68" to="82" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Model-based diagnosis of spreadsheet programs - A constraint-based debugging approach</title>
		<author>
			<persName><forename type="first">D</forename><surname>Jannach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Schmitz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Autom. Softw. Eng</title>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note>to appear</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Avoiding, finding and fixing spreadsheet errors - a survey of automated approaches for spreadsheet QA</title>
		<author>
			<persName><forename type="first">D</forename><surname>Jannach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Schmitz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Hofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wotawa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Systems and Software</title>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
	<note>to appear</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Using two heads in practice</title>
		<author>
			<persName><forename type="first">F</forename><surname>Karlsson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. WEUSE 2008</title>
				<meeting>WEUSE 2008</meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="43" to="47" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Are Automated Debugging Techniques Actually Helping Programmers?</title>
		<author>
			<persName><forename type="first">C</forename><surname>Parnin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Orso</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. ISSTA 2011</title>
				<meeting>ISSTA 2011</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="199" to="209" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">A critical review of the literature on spreadsheet errors</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">G</forename><surname>Powell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">R</forename><surname>Baker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Lawson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Decision Support Systems</title>
		<imprint>
			<biblScope unit="volume">46</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="128" to="138" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Slicing Spreadsheets: An Integrated Methodology for Spreadsheet Testing and Debugging</title>
		<author>
			<persName><forename type="first">J</forename><surname>Reichwein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Rothermel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Burnett</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. DSL</title>
				<meeting>DSL</meeting>
		<imprint>
			<date type="published" when="1999">1999</date>
			<biblScope unit="page" from="25" to="38" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
