=Paper= {{Paper |id=Vol-1209/paper8 |storemode=property |title=End-user Development via Sheet-defined Functions |pdfUrl=https://ceur-ws.org/Vol-1209/paper_8.pdf |volume=Vol-1209 }} ==End-user Development via Sheet-defined Functions== https://ceur-ws.org/Vol-1209/paper_8.pdf
         End-user development via sheet-defined functions

                                   ∗
                 Peter Sestoft                     Jonas Druedahl Rask                    Simon Eikeland
                                                                                           Timmermann




ABSTRACT
We have developed an implementation of sheet-defined func-
tions, a mechanism that allows spreadsheet users to define
their own functions, using only spreadsheet concepts such as
cells, formulas and references, and no external programming
languages. This position paper presents the motivation and
vision of this work, describes the features of our prototype
implementation, and outlines future work.

Keywords
Spreadsheets, end-user development, functional programming        Figure 1: A sheet-defined function implementing
                                                                  Excel’s NOMINAL built-in. Cells B23 and B24 are in-
1.   INTRODUCTION                                                 put cells, cell B26 is the output cell, and B25 holds
Spreadsheet programs such as Microsoft Excel, OpenOffice          an intermediate result. The call to DEFINE in cell A22
Calc, Gnumeric and Google Docs are used by millions of            creates the function. Cell A28 contains a call to the
people to develop and maintain complex models in finance,         defined function. It takes around 200 ns to execute
science, engineering and administration. Yet Peyton Jones,        it, of which 80 ns is due to exponentiation (ˆ). As
Burnett and Blackwell [9] observed that spreadsheet pro-          shown in cell A28, a sheet-defined function is called
grams lack even the most basic abstraction facility — a way       just like a built-in or VBA function.
to encapsulate an expression as a reusable function — and
proposed a design for such a mechanism.
                                                                  nisms, saying about the traditional combination of Excel
We have implemented this idea in the form of sheet-defined        and externally defined functions that “change control, static
functions. A user may define a function F simply by declar-       type checking, abstraction and reuse are almost completely
ing which (input) cells will hold F’s formal parameters and       lacking” [1].
which (output) cell will compute F’s result, as a function of
the input cells. The function definition may involve arbi-        2.   THE VISION
trary additional cells, spreadsheet formulas, calls to built-in   The ultimate goal of this work is to allow spreadsheet users
functions, and calls to other sheet-defined functions. Fig-       themselves to develop and evolve libraries of user-defined
ure 1 shows an example. In the example, values are referred       functions to support sophisticated spreadsheet models. Defin-
to by cell (such as B25). A mechanism that allows for sym-        ing a function requires only well-known spreadsheet concepts
bolic names (such as “periods”) instead could be added, but       such as cell, cell reference and function, and no external pro-
Nardi speculates that end user developers would not neces-        gramming languages. Therefore experimentation and adap-
sarily find that better [7, page 44].                             tation of user-defined functions remain under the control of
                                                                  the spreadsheet users and domain experts, who need not
Augustsson et al. from Standard Chartered Bank provide            wait for an IT department to understand, describe, imple-
further support for the utility of such abstraction mecha-        ment and test the desired changes.
∗sestoft@itu.dk, IT University of Copenhagen, Denmark
                                                                  Any spreadsheet computation can be turned into a sheet-
                                                                  defined function. This ensures conceptual and notational
                                                                  simplicity. Moreover, it means that new user-defined func-
                                                                  tions may arise by refactoring of a spreadsheet model as
                                                                  it evolves. As a spreadsheet model becomes more refined
                                                                  and complex, it may be observed that the same cluster of
                                                                  formulas appears again and again. Such a cluster of formu-
                                                                  las may then be encapsulated in a sheet-defined function,
                                                                  and each formula cluster replaced by a call to that function.
                                                                  This both simplifies the spreadsheet model and improves its
Table 1: Time to compute the cumulative distribu-                 Table 2: Time to call a square root function; includes
tion function of the normal distribution N (0, 1).                recalculation time.
      Implementation                Time/call (ns)                  Calling                              Time/call (ns)
      Sheet-defined function                   118                  Sheet-defined function from Funcalc             400
      C#                                        47                  Excel built-in from Excel                       160
      C (gcc 4.2.1 -O3)                         51                  .NET function from Excel/Excel-DNA            4,900
      Excel 2007 VBA function                1925                   VBA function from Excel                      12,000
      Excel 2007 built-in NORMSDIST            993

                                                                  value may be called as APPLY(A42,0.053543,4) using built-
maintainability, because a bug-fix or other improvement to        in function APPLY.
the function will automatically affect all its uses, unlike the
traditional situation when there are multiple copies of the       Function values are built by applying a sheet-defined func-
same cluster of formulas.                                         tion to only some of its arguments, the absent arguments
                                                                  being given as NA(); the resulting function value will dis-
Sheet-defined functions may be shared with other users in         play as NOMINAL(#NA,4) or similar.
the same department or application domain, without pre-
venting them from making their own improvements — be-             Such a function value may be specialized, or partially eval-
cause the domain knowledge is not locked into the notation        uated, with respect to its available (non-#NA) arguments.
of a “real” programming language, but one that presumably         The result is a new function value with the same behavior
is familiar to users and that they are (more) comfortable         but potentially better performance because the available ar-
experimenting with.                                               gument values have been inlined and loops unrolled in the
                                                                  underlying bytecode. For more information, see [5] and [12].
Sheet-defined functions support end-user “tinkering” to de-       Specialization provides some amount of incremental compu-
velop models and workflows that are appropriate within            tation and memoization, and we do not currently have other
their application domain [7]. Clearly not all spreadsheet         general mechanisms for these purposes.
users will be equally competent developers of sheet-defined
functions, and clearly not all software should be developed       A forthcoming book [13] gives many more details of the im-
in this way. However, judging from the huge popularity            plementation, more examples of sheet-defined functions, and
of spreadsheets within banks, finance, management, science        a manual for Funcalc. A previously published paper [15]
and engineering, the immediate response and the user con-         presents a case study of reimplementing Excel’s built-in fi-
trol offered by spreadsheets are attractive features. Also,       nancial functions as sheet-defined functions.
from anecdotal evidence, structured use of spreadsheets is a
flexible, fast and cheap alternative to “big bang” professional   A comprehensive list of US spreadsheet patents is given in
IT projects.                                                      a forthcoming report [14].

3.    THE FUNCALC PROTOTYPE                                       4.   INTEGRATION WITH EXCEL
We have created a prototype implementation of sheet-defined       In ongoing work [10] we integrate sheet-defined functions
functions, called Funcalc. The implementation is written in       with the widely used Microsoft Excel spreadsheet program,
C#, is quite compact (12,000 lines of code) and compiles          rather than our Funcalc prototype, as illustrated in Figures 2
sheet-defined functions to .NET bytecode [3] at run-time.         and 3. This enables large-scale experimentation with sheet-
As shown by Table 1 execution efficiency is very good; this       defined functions because they can be defined in a context
is due both to local optimizations performed by our function      that is familiar to spreadsheet users and provides charting,
compiler and to Microsoft’s considerable engineering effort       auditing, and so on.
in the underlying .NET just-in-time compiler.
                                                                  The main downside is that calling a sheet-defined function
Funcalc features include:                                         from Excel is much slower than from the Funcalc implemen-
                                                                  tation (yet apparently faster than calling a VBA function);
                                                                  see Table 2. However, the sheet-defined function itself will
     • a “normal” interpretive spreadsheet implementation;        execute at the same speed as in Funcalc. This work uses the
     • a compiled implementation of sheet-defined functions;      Excel-DNA runtime bridge between Excel and .NET [4].

     • recursive functions and higher-order functions;            5.   FUTURE WORK
     • functions can accept and return array values in addi-      So far we have focused mostly on functionality and good
       tion to numbers and string;                                performance. We emphasize performance because we want
                                                                  sheet-defined functions to replace not only user-defined func-
     • automatic specialization, or partial evaluation [12];      tions written in VBA, C++ and other external languages,
                                                                  but to replace built-in functions also. Domain experts in
     • facilities for benchmarking sheet-defined functions.
                                                                  finance, statistics and other areas of rather general interest
                                                                  should be able to develop well-performing high-quality func-
Because Funcalc supports higher-order functions, the value        tions themselves and not have to rely on Microsoft or other
contained in a cell, say A42, may be a function value. This       vendors to do so.
Figure 2: Funcalc as Excel plug-in, showing formulas of sheet-defined function TRIAREA with input cells A3,
B3 and C3, intermediate cell D3, and output cell E3. The call to DEFINE in cell E4 creates the function.
Through the new “Excelcalc” menu one can interact with the underlying Funcalc implementation and the
Excel-Funcalc bridge (mostly for development purposes).




Figure 3: Same sheet as in Figure 2, here showing values rather than formulas. Note the editing in progress
of a call to sheet-defined function TRIAREA in cell E6.
However, a well-performing implementation of sheet-defined        [7] B. A. Nardi A small matter of programming.
functions is just the beginning: one should investigate ad-           Perspectives on end user programming. MIT Press,
ditional infrastructure and practices to support their use.           1993.
For instance, how can we extend the natural collaboration         [8] B. A. Nardi and J. R. Miller Twinkling lights and
around spreadsheet development [8] in a community of users            nested loops: distributed problem solving and
to cover also libraries of sheet-defined functions; how can we        spreadsheet development. International Journal of
support versioning and merging of such libraries in a way             Man-Machine Studies, 34:161–184, 1991.
that does not preclude individual users’ tinkering and ex-        [9] S. Peyton Jones, A. Blackwell, and M. Burnett. A
perimentation; how can we support systematic testing; and             user-centred approach to functions in Excel. In ICFP
so on.                                                                ’03: Proceedings of the Eighth ACM SIGPLAN
                                                                      International Conference on Functional Programming,
Our concept of sheet-defined functions should be subjected            pages 165–176. ACM, 2003.
to a systematic usability study; the study conducted by          [10] J. D. Rask and S. E. Timmermann. Integration of
Peyton-Jones, Blackwell and Burnett [9] assumed that func-            sheet-defined functions in Excel using C#. Master’s
tions could not be recursive, whereas ours can.                       thesis, IT University of Copenhagen, 2014. (Expected
                                                                      June 2014).
Finally, sheet-defined functions lend themselves well to par-    [11] V. Sarkar and J. Hennessy. Compile-time partitioning
allelization, because they are pure (yet strict, an unusual           and scheduling of parallel programs. In ACM
combination) so that computations can be reordered and                SIGPLAN ’86 Symposium on Compiler Construction,
performed speculatively, and often exhibit considerable ex-           pages 17–26, June 1986.
plicit parallelism. In fact, they resemble dataflow languages
                                                                 [12] P. Sestoft. Online partial evaluation of sheet-defined
such as Sisal [6]. Presumably some of the 1980es techniques
                                                                      functions. In A. Banerjee, O. Danvy, K. Doh, and
used to schedule dataflow languages [11] could be used to
                                                                      J. Hatcliff, editors, Semantics, Abstract Interpretation,
perform spreadsheet computations efficiently on modern mul-
                                                                      and Reasoning about Programs, volume 129 of
ticore machines. The result might be “supercomputing for
                                                                      Electronic Proceedings in Theoretical Computer
the masses”, realizing Chandy’s 1984 vision [2].
                                                                      Science, pages 136–160, 2013.
                                                                 [13] P. Sestoft. Spreadsheet Implementation Technology.
6.   CONCLUSION                                                       Basics and Extensions. MIT Press, 2014. ISBN
We have presented a prototype implementation of sheet-                978-0-262-52664-7. (Expected August 2014). 313
defined functions and outlined some future work. Our hope             pages.
is that such functionality will become available in widely       [14] P. Sestoft. Spreadsheet patents. Technical Report
used spreadsheet programs, or via a full-featured version of          ITU-TR-2014-178, IT University of Copenhagen, 2014.
the plugin described in Section 4, and will enable spread-            ISBN 978-87-7949-317-9. (To appear).
sheet users to develop their own computational models into       [15] P. Sestoft and J. Z. Sørensen. Sheet-defined functions:
reusable function libraries, without loss of computational            implementation and initial evaluation. In Y. Dittrich
efficiency and without handing over control to remote IT              et al., editors, International Symposium on End-User
departments or software contractors. Moreover, there seems            Development, June 2013, volume 7897 of Lecture
to be a technological opportunity to harness the power of             Notes in Computer Science, pages 88–103, 2013.
multicore machines through spreadsheet programming.

7.   REFERENCES
 [1] L. Augustsson, H. Mansell, and G. Sittampalam.
     Paradise: A two-stage DSL embedded in Haskell. In
     International Conference on Functional Programming
     (ICFP’08), pages 225–228. ACM, September 2008.
 [2] M. Chandy. Concurrent programming for the masses.
     (PODC 1984 invited address). In Principles of
     Distributed Computing 1985, pages 1–12. ACM, 1985.
 [3] Ecma TC39 TG3. Common Language Infrastructure
     (CLI). Standard ECMA-335. Ecma International,
     sixth edition, June 2012.
 [4] Excel DNA Project. Homepage. At
     http://exceldna.codeplex.com/ on 28 February 2014.
 [5] N. D. Jones, C. Gomard, and P. Sestoft. Partial
     Evaluation and
     Automatic Program Generation. Prentice Hall, 1993. At
     http://www.itu.dk/people/sestoft/pebook/pebook.html
     on 9 June 2013.
 [6] J. McGraw et al. Sisal. Streams and iteration in a
     single assignment language. Language reference
     manual, version 1.2. Technical report, Lawrence
     Livermore National Labs, March 1985.