=Paper=
{{Paper
|id=Vol-1209/paper8
|storemode=property
|title=End-user Development via Sheet-defined Functions
|pdfUrl=https://ceur-ws.org/Vol-1209/paper_8.pdf
|volume=Vol-1209
}}
==End-user Development via Sheet-defined Functions==
End-user development via sheet-defined functions
∗
Peter Sestoft Jonas Druedahl Rask Simon Eikeland
Timmermann
ABSTRACT
We have developed an implementation of sheet-defined func-
tions, a mechanism that allows spreadsheet users to define
their own functions, using only spreadsheet concepts such as
cells, formulas and references, and no external programming
languages. This position paper presents the motivation and
vision of this work, describes the features of our prototype
implementation, and outlines future work.
Keywords
Spreadsheets, end-user development, functional programming Figure 1: A sheet-defined function implementing
Excel’s NOMINAL built-in. Cells B23 and B24 are in-
1. INTRODUCTION put cells, cell B26 is the output cell, and B25 holds
Spreadsheet programs such as Microsoft Excel, OpenOffice an intermediate result. The call to DEFINE in cell A22
Calc, Gnumeric and Google Docs are used by millions of creates the function. Cell A28 contains a call to the
people to develop and maintain complex models in finance, defined function. It takes around 200 ns to execute
science, engineering and administration. Yet Peyton Jones, it, of which 80 ns is due to exponentiation (ˆ). As
Burnett and Blackwell [9] observed that spreadsheet pro- shown in cell A28, a sheet-defined function is called
grams lack even the most basic abstraction facility — a way just like a built-in or VBA function.
to encapsulate an expression as a reusable function — and
proposed a design for such a mechanism.
nisms, saying about the traditional combination of Excel
We have implemented this idea in the form of sheet-defined and externally defined functions that “change control, static
functions. A user may define a function F simply by declar- type checking, abstraction and reuse are almost completely
ing which (input) cells will hold F’s formal parameters and lacking” [1].
which (output) cell will compute F’s result, as a function of
the input cells. The function definition may involve arbi- 2. THE VISION
trary additional cells, spreadsheet formulas, calls to built-in The ultimate goal of this work is to allow spreadsheet users
functions, and calls to other sheet-defined functions. Fig- themselves to develop and evolve libraries of user-defined
ure 1 shows an example. In the example, values are referred functions to support sophisticated spreadsheet models. Defin-
to by cell (such as B25). A mechanism that allows for sym- ing a function requires only well-known spreadsheet concepts
bolic names (such as “periods”) instead could be added, but such as cell, cell reference and function, and no external pro-
Nardi speculates that end user developers would not neces- gramming languages. Therefore experimentation and adap-
sarily find that better [7, page 44]. tation of user-defined functions remain under the control of
the spreadsheet users and domain experts, who need not
Augustsson et al. from Standard Chartered Bank provide wait for an IT department to understand, describe, imple-
further support for the utility of such abstraction mecha- ment and test the desired changes.
∗sestoft@itu.dk, IT University of Copenhagen, Denmark
Any spreadsheet computation can be turned into a sheet-
defined function. This ensures conceptual and notational
simplicity. Moreover, it means that new user-defined func-
tions may arise by refactoring of a spreadsheet model as
it evolves. As a spreadsheet model becomes more refined
and complex, it may be observed that the same cluster of
formulas appears again and again. Such a cluster of formu-
las may then be encapsulated in a sheet-defined function,
and each formula cluster replaced by a call to that function.
This both simplifies the spreadsheet model and improves its
Table 1: Time to compute the cumulative distribu- Table 2: Time to call a square root function; includes
tion function of the normal distribution N (0, 1). recalculation time.
Implementation Time/call (ns) Calling Time/call (ns)
Sheet-defined function 118 Sheet-defined function from Funcalc 400
C# 47 Excel built-in from Excel 160
C (gcc 4.2.1 -O3) 51 .NET function from Excel/Excel-DNA 4,900
Excel 2007 VBA function 1925 VBA function from Excel 12,000
Excel 2007 built-in NORMSDIST 993
value may be called as APPLY(A42,0.053543,4) using built-
maintainability, because a bug-fix or other improvement to in function APPLY.
the function will automatically affect all its uses, unlike the
traditional situation when there are multiple copies of the Function values are built by applying a sheet-defined func-
same cluster of formulas. tion to only some of its arguments, the absent arguments
being given as NA(); the resulting function value will dis-
Sheet-defined functions may be shared with other users in play as NOMINAL(#NA,4) or similar.
the same department or application domain, without pre-
venting them from making their own improvements — be- Such a function value may be specialized, or partially eval-
cause the domain knowledge is not locked into the notation uated, with respect to its available (non-#NA) arguments.
of a “real” programming language, but one that presumably The result is a new function value with the same behavior
is familiar to users and that they are (more) comfortable but potentially better performance because the available ar-
experimenting with. gument values have been inlined and loops unrolled in the
underlying bytecode. For more information, see [5] and [12].
Sheet-defined functions support end-user “tinkering” to de- Specialization provides some amount of incremental compu-
velop models and workflows that are appropriate within tation and memoization, and we do not currently have other
their application domain [7]. Clearly not all spreadsheet general mechanisms for these purposes.
users will be equally competent developers of sheet-defined
functions, and clearly not all software should be developed A forthcoming book [13] gives many more details of the im-
in this way. However, judging from the huge popularity plementation, more examples of sheet-defined functions, and
of spreadsheets within banks, finance, management, science a manual for Funcalc. A previously published paper [15]
and engineering, the immediate response and the user con- presents a case study of reimplementing Excel’s built-in fi-
trol offered by spreadsheets are attractive features. Also, nancial functions as sheet-defined functions.
from anecdotal evidence, structured use of spreadsheets is a
flexible, fast and cheap alternative to “big bang” professional A comprehensive list of US spreadsheet patents is given in
IT projects. a forthcoming report [14].
3. THE FUNCALC PROTOTYPE 4. INTEGRATION WITH EXCEL
We have created a prototype implementation of sheet-defined In ongoing work [10] we integrate sheet-defined functions
functions, called Funcalc. The implementation is written in with the widely used Microsoft Excel spreadsheet program,
C#, is quite compact (12,000 lines of code) and compiles rather than our Funcalc prototype, as illustrated in Figures 2
sheet-defined functions to .NET bytecode [3] at run-time. and 3. This enables large-scale experimentation with sheet-
As shown by Table 1 execution efficiency is very good; this defined functions because they can be defined in a context
is due both to local optimizations performed by our function that is familiar to spreadsheet users and provides charting,
compiler and to Microsoft’s considerable engineering effort auditing, and so on.
in the underlying .NET just-in-time compiler.
The main downside is that calling a sheet-defined function
Funcalc features include: from Excel is much slower than from the Funcalc implemen-
tation (yet apparently faster than calling a VBA function);
see Table 2. However, the sheet-defined function itself will
• a “normal” interpretive spreadsheet implementation; execute at the same speed as in Funcalc. This work uses the
• a compiled implementation of sheet-defined functions; Excel-DNA runtime bridge between Excel and .NET [4].
• recursive functions and higher-order functions; 5. FUTURE WORK
• functions can accept and return array values in addi- So far we have focused mostly on functionality and good
tion to numbers and string; performance. We emphasize performance because we want
sheet-defined functions to replace not only user-defined func-
• automatic specialization, or partial evaluation [12]; tions written in VBA, C++ and other external languages,
but to replace built-in functions also. Domain experts in
• facilities for benchmarking sheet-defined functions.
finance, statistics and other areas of rather general interest
should be able to develop well-performing high-quality func-
Because Funcalc supports higher-order functions, the value tions themselves and not have to rely on Microsoft or other
contained in a cell, say A42, may be a function value. This vendors to do so.
Figure 2: Funcalc as Excel plug-in, showing formulas of sheet-defined function TRIAREA with input cells A3,
B3 and C3, intermediate cell D3, and output cell E3. The call to DEFINE in cell E4 creates the function.
Through the new “Excelcalc” menu one can interact with the underlying Funcalc implementation and the
Excel-Funcalc bridge (mostly for development purposes).
Figure 3: Same sheet as in Figure 2, here showing values rather than formulas. Note the editing in progress
of a call to sheet-defined function TRIAREA in cell E6.
However, a well-performing implementation of sheet-defined [7] B. A. Nardi A small matter of programming.
functions is just the beginning: one should investigate ad- Perspectives on end user programming. MIT Press,
ditional infrastructure and practices to support their use. 1993.
For instance, how can we extend the natural collaboration [8] B. A. Nardi and J. R. Miller Twinkling lights and
around spreadsheet development [8] in a community of users nested loops: distributed problem solving and
to cover also libraries of sheet-defined functions; how can we spreadsheet development. International Journal of
support versioning and merging of such libraries in a way Man-Machine Studies, 34:161–184, 1991.
that does not preclude individual users’ tinkering and ex- [9] S. Peyton Jones, A. Blackwell, and M. Burnett. A
perimentation; how can we support systematic testing; and user-centred approach to functions in Excel. In ICFP
so on. ’03: Proceedings of the Eighth ACM SIGPLAN
International Conference on Functional Programming,
Our concept of sheet-defined functions should be subjected pages 165–176. ACM, 2003.
to a systematic usability study; the study conducted by [10] J. D. Rask and S. E. Timmermann. Integration of
Peyton-Jones, Blackwell and Burnett [9] assumed that func- sheet-defined functions in Excel using C#. Master’s
tions could not be recursive, whereas ours can. thesis, IT University of Copenhagen, 2014. (Expected
June 2014).
Finally, sheet-defined functions lend themselves well to par- [11] V. Sarkar and J. Hennessy. Compile-time partitioning
allelization, because they are pure (yet strict, an unusual and scheduling of parallel programs. In ACM
combination) so that computations can be reordered and SIGPLAN ’86 Symposium on Compiler Construction,
performed speculatively, and often exhibit considerable ex- pages 17–26, June 1986.
plicit parallelism. In fact, they resemble dataflow languages
[12] P. Sestoft. Online partial evaluation of sheet-defined
such as Sisal [6]. Presumably some of the 1980es techniques
functions. In A. Banerjee, O. Danvy, K. Doh, and
used to schedule dataflow languages [11] could be used to
J. Hatcliff, editors, Semantics, Abstract Interpretation,
perform spreadsheet computations efficiently on modern mul-
and Reasoning about Programs, volume 129 of
ticore machines. The result might be “supercomputing for
Electronic Proceedings in Theoretical Computer
the masses”, realizing Chandy’s 1984 vision [2].
Science, pages 136–160, 2013.
[13] P. Sestoft. Spreadsheet Implementation Technology.
6. CONCLUSION Basics and Extensions. MIT Press, 2014. ISBN
We have presented a prototype implementation of sheet- 978-0-262-52664-7. (Expected August 2014). 313
defined functions and outlined some future work. Our hope pages.
is that such functionality will become available in widely [14] P. Sestoft. Spreadsheet patents. Technical Report
used spreadsheet programs, or via a full-featured version of ITU-TR-2014-178, IT University of Copenhagen, 2014.
the plugin described in Section 4, and will enable spread- ISBN 978-87-7949-317-9. (To appear).
sheet users to develop their own computational models into [15] P. Sestoft and J. Z. Sørensen. Sheet-defined functions:
reusable function libraries, without loss of computational implementation and initial evaluation. In Y. Dittrich
efficiency and without handing over control to remote IT et al., editors, International Symposium on End-User
departments or software contractors. Moreover, there seems Development, June 2013, volume 7897 of Lecture
to be a technological opportunity to harness the power of Notes in Computer Science, pages 88–103, 2013.
multicore machines through spreadsheet programming.
7. REFERENCES
[1] L. Augustsson, H. Mansell, and G. Sittampalam.
Paradise: A two-stage DSL embedded in Haskell. In
International Conference on Functional Programming
(ICFP’08), pages 225–228. ACM, September 2008.
[2] M. Chandy. Concurrent programming for the masses.
(PODC 1984 invited address). In Principles of
Distributed Computing 1985, pages 1–12. ACM, 1985.
[3] Ecma TC39 TG3. Common Language Infrastructure
(CLI). Standard ECMA-335. Ecma International,
sixth edition, June 2012.
[4] Excel DNA Project. Homepage. At
http://exceldna.codeplex.com/ on 28 February 2014.
[5] N. D. Jones, C. Gomard, and P. Sestoft. Partial
Evaluation and
Automatic Program Generation. Prentice Hall, 1993. At
http://www.itu.dk/people/sestoft/pebook/pebook.html
on 9 June 2013.
[6] J. McGraw et al. Sisal. Streams and iteration in a
single assignment language. Language reference
manual, version 1.2. Technical report, Lawrence
Livermore National Labs, March 1985.