=Paper=
{{Paper
|id=Vol-1209/paper10
|storemode=property
|title=MDSheet: Model-Driven Spreadsheets
|pdfUrl=https://ceur-ws.org/Vol-1209/paper_11.pdf
|volume=Vol-1209
}}
==MDSheet: Model-Driven Spreadsheets==
MDSheet – Model-Driven Spreadsheets
Jácome Cunha João Paulo Fernandes Jorge Mendes
Rui Pereira João Saraiva
{jacome,jpaulo,jorgemendes,ruipereira,jas}@di.uminho.pt
HASLab/INESC TEC & Universidade do Minho, Portugal
CIICESI, ESTGF, Instituto Politécnico do Porto, Portugal
RELEASE, Universidade da Beira Interior, Portugal
ABSTRACT has been fully incorporated in a tool, MDSheet1 , whose fea-
This paper showcases MDSheet, a framework aimed at im- tures include:2
proving the engineering of spreadsheets. This framework is 1) Model inference: we extract the abstract representation
model-driven, and has been fully integrated under a spread- from legacy spreadsheets;
sheet system. Also, its practical interest has been demon- 2) Embedded models: this abstract representation is manip-
strated by several empirical studies. ulated and evolved in spreadsheets themselves;
3) User guidance: relying on this business model, we are able
of guiding users in avoiding traditional spreadsheet mistakes;
Categories and Subject Descriptors 4) Model/instance synchronization: we support the evolu-
tion of model and instances, ensuring an automatic synchro-
H.4 [Information Systems Applications]: Spreadsheets;
nization of the unevolved artifact;
D.2.0 [Software Engineering]: General; D.2.6 [Software
5) Model quality assessment: a set of metrics on the com-
Engineering]: Programming Environments—Graphical en-
plexity of a spreadsheet model can be computed;
vironments, Integrated environments, Interactive environ-
6) Querying: spreadsheet data can be queried.
ments
2. SPREADSHEET ENGINEERING
General Terms MDSheet is a framework for the engineering of spread-
Languages, Design, Human Factors sheets in a model-driven fashion. This framework is highly
extensible: we have actually extended it with several new
functionalities that we have developed in the last few years.
Keywords
Model-Driven Spreadsheets, MDSheet, Model Inference, Em- 2.1 Motivational Example
bedding, Bidirectional Synchronization, Querying The realization of our approach to spreadsheet engineering
builds upon the embedding of ClassSheets in a spreadsheet
system. So, we start by introducing ClassSheets within MD-
1. INTRODUCTION Sheet with the example given in Figure 1: we present a model
We can not run the modern world without spreadsheets. for a Budget spreadsheet (Figure 1a), which we adapted
Spreadsheets are omnipresent, from individuals needing to from [13]3 , and an instance of such model (Figure 1b).
cope with simple needs to large companies needing to im- This model holds three classes where data is to be in-
plement complex forecasts or to produce advanced reports. serted by end users: i) Year, with a default value of 2010,
The realization of such importance has made concrete im- for the budget to accommodate multi-year information, ii)
pact in the scientific community as well. This is due to more Category, for assigning a label to each expense and iii),
research teams devoting their efforts to improving spread- a(n implicit) relationship class where quantity and costs are
sheets, and a growing number of scientific events dedicated registered and totals are calculated based on them. The ac-
to them. tual spreadsheet may hold several repetitions of any of these
A successful approach to address spreadsheets under a sci- elements, as indicated by the ellipsis. For each expense we
entific perspective consists of incorporating well-established record its quantity and its cost (with 0 as the default value),
software engineering techniques in the spreadsheet develop- and we calculate the total amount associated with it. Fi-
ment process. nally, (simple) summation formulas are used to calculate the
Our approach is essentially based on precisely one such global amount spent per year (cell D5), the amount spent
technique: we adopt model-driven spreadsheet engineering. per expense type in all years (cell F3) and the total amount
In the setting we propose a spreadsheet is abstracted through spent in all years (cell F5) are also calculated.
a concise model, which is then used to improve effective- 1
MDSheet is available through the SSaaPP project website:
ness and efficiency of spreadsheet users. The framework http://ssaapp.di.uminho.pt.
we describe in this paper has been realized in a traditional 2
In the next section, we describe each such feature in a dif-
spreadsheet development system, thus not forcing spread- ferent subsection.
sheet users to move to a different paradigm. 3
We assume colors are visible in the digital version of this
The spreadsheet development framework that we envision paper.
(a) Model worksheet.
(b) Data/instance worksheet.
Figure 1: A bidirectional model-driven environment for a budget spreadsheet.
Following is the description of the full set of features of- embedding of spreadsheets has also been implemented under
fered by MDSheet. MDSheet [6], which was demonstrated in Section 2.1. More-
over, we extended the ClassSheets language with database
2.2 Model Inference constraints, such as unique columns/rows or foreign keys,
A model-driven approach to spreadsheet engineering of- which have also been incorporated in MDSheet [11]. In fact,
fers an improved development experience: an abstract rep- we have further extended the available restrictions so that
resentation of a spreadsheet, i.e., its model, helps us, among the user can specify the contents of a cell using regular ex-
other things, in guiding users into preventing errors. This pressions or intervals [8]. Finally, we extended ClassSheets
approach, however, requires the definition of a model in par- with references between different models making them more
allel with the spreadsheet it abstracts. In order to handle flexible. Note that through this embedding we can guaran-
legacy spreadsheets, i.e., the ones that have not been de- tee that spreadsheet data always conforms to a model.
veloped from scratch together with their model, we have
devised a model inference technique [2], that has been im- 2.4 User Guidance
plemented in MDSheet. Concretely, we infer models in the
The embedding of our extended version of the ClassSheet
ClassSheets language, an object-oriented high-level formal-
language allows us to guide the user in inserting correct data.
ism to abstract spreadsheets [13].
When a model is designed, it serves as a guider in the cre-
ation of a data worksheet, which is initially empty. Only
2.3 Embedded Models cells containing plain data can be edited as all other are
The worksheet structure of spreadsheets is a decisive fac- inferred from the model. This prevents, e.g., users from
tor in their modularity. In fact, we exploited precisely this making mistakes when defining formulas as they are locked.
structure to make the model of a spreadsheet available within Moreover, the restrictions created in the model guarantee
spreadsheet systems themselves: one worksheet holds the that the data in the cells respects them. In the model it is
model of a spreadsheet, while another holds its data. This possible to define an interval of integers for a cell, or a regu-
lar expression that the content must conform to. A column in an agile testing framework of a software house; ii), to
or row can be marked as having only unique values or being adapt data produced by different database systems for a
a foreign key to another column or row. All these restric- car multimedia production company; and iii), to provide
tions are enforced by MDSheet. In the case of foreign keys, spreadsheet models for a food bank.
the user can use a combo box to select existing values from
the referred column/row. Acknowledgments
2.5 Model/Instance Synchronization This work is part funded by the ERDF - European Regional
Development Fund through the COMPETE Programme (op-
As any other software artifact, spreadsheets evolve over
erational programme for competitiveness) and by National
time. MDSheet accommodates changes by allowing the evo-
Funds through the FCT - Fundação para a Ciência e a Tec-
lution of models and instances, while automatically coevolv-
nologia (Portuguese Foundation for Science and Technology)
ing the unchanged artifact. For this, we introduced a for-
within project FCOMP-01-0124-FEDER-020532. The first au-
mal framework to allow evolutions of the model to be au-
thor was funded by the FCT grant SFRH/BPD/73358/2010.
tomatically spanned to the instances [6, 7, 12]. We have
later proposed techniques and tools to the evolution of data
by the user and corresponding automatic coevolution of the 5. REFERENCES
model [3]. We therefore ensure that model/instance consis- [1] O. Belo, J. Cunha, J. P. Fernandes, J. Mendes,
tency is never broken. R. Pereira, and J. Saraiva. Querysheet: A bidirectional
query environment for model-driven spreadsheets. In
2.6 Model Quality Assessment VLHCC ’13, pages 199–200. IEEE CS, 2013.
In a first attempt to measure the quality of a spreadsheet [2] J. Cunha, M. Erwig, and J. Saraiva. Automatically
model, we introduced a set of metrics to calculate the com- inferring classsheet models from spreadsheets. In
plexity of ClassSheet models [9]. These metrics are imple- VLHCC ’10, pages 93–100. IEEE CS, 2010.
mented under MDSheet and can be calculated for any Class- [3] J. Cunha, J. P. Fernandes, J. Mendes, H. Pacheco,
Sheet defined using it. They are then compared to the same and J. Saraiva. Bidirectional transformation of
metrics computed for a repository of ClassSheet models so model-driven spreadsheets. In ICMT ’12, volume 7307
users can have a reference point for such values. The evolu- of LNCS, pages 105–120. Springer, 2012.
tion mechanisms can then be used to evolve the spreadsheet [4] J. Cunha, J. P. Fernandes, J. Mendes, R. Pereira, and
improving it according to the metrics calculated. J. Saraiva. Querying model-driven spreadsheets. In
VLHCC ’13, pages 83–86. IEEE CS, 2013.
2.7 Querying [5] J. Cunha, J. P. Fernandes, J. Mendes, R. Pereira, and
As many spreadsheets are used as data repositories, the J. Saraiva. Embedding model-driven spreadsheet
need to query their data is frequent. MDSheet also inte- queries in spreadsheet systems. In VLHCC ’14, 2014.
grates a query system, which allows the definition of model- to appear.
oriented queries, in the style of traditional database queries. [6] J. Cunha, J. P. Fernandes, J. Mendes, and J. Saraiva.
This allows the writing of queries without having to man- Embedding and evolution of spreadsheet models in
ually observe a possibly large number of columns and rows spreadsheet systems. In VLHCC ’11, pages 186–201.
of concrete data. Indeed, queries are written, by analyzing [7] J. Cunha, J. P. Fernandes, J. Mendes, and J. Saraiva.
models, as abstractions that are simpler to understand. Our MDSheet: A framework for model-driven spreadsheet
system was initially presented as a textual language [1, 4], engineering. In ICSE 2012, pages 1412–1415. ACM.
very similar to SQL. Even being textual it already was of
[8] J. Cunha, J. P. Fernandes, J. Mendes, and J. Saraiva.
great help for users [14]. Still, we have further improved it
Extension and implementation of classsheet models. In
by embedding the language in a worksheet, thus creating a
VLHCC ’12, pages 19–22. IEEE CS, 2012.
visual language for spreadsheet querying [5].
[9] J. Cunha, J. P. Fernandes, J. Mendes, and J. Saraiva.
Complexity Metrics for ClassSheet Models. In
3. EMPIRICAL VALIDATION ICCSA ’13, volume 7972, pages 459–474. LNCS, 2013.
One of the purposes of our tool is to help users commit [10] J. Cunha, J. P. Fernandes, J. Mendes, and J. Saraiva.
less errors; if possible, it also intends to help users work Embedding, evolution, and validation of spreadsheet
faster with spreadsheets. To assess these two concerns we models in spreadsheet systems. 2014. submitted.
have run an empirical study and we have found empirical [11] J. Cunha, J. P. Fernandes, and J. Saraiva. From
evidence that indeed our model-driven spreadsheet environ- Relational ClassSheets to UML+OCL. In SAC ’12,
ment can in fact help users become more efficient and more pages 1151–1158. ACM, 2012.
effective [10]. [12] J. Cunha, J. Visser, T. Alves, and J. Saraiva.
Type-safe evolution of spreadsheets. In
4. CONCLUSION D. Giannakopoulou and F. Orejas, editors, FASE ’11,
We briefly presented MDSheet and all the features it of- volume 6603 of LNCS, pages 186–201. Springer, 2011.
fers to its users. Given the fact that it has been built as a [13] G. Engels and M. Erwig. ClassSheets: automatic
framework, new tools, even if not proposed by us, can easily generation of spreadsheet applications from
be integrated in it. object-oriented specifications. In ASE ’05, pages
We believe this tool is in a very mature state and can 124–133. ACM, 2005.
be used in real case scenarios. We have thus started its [14] R. Pereira. Querying for model-driven spreadsheets.
integration in industry: i), to support test case evolution Master’s thesis, University of Minho, 2013.