<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>MDSheet - Model-Driven Spreadsheets</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jácome Cunha</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>João Paulo Fernandes Rui Pereira João Saraiva</string-name>
          <email>jpaulo@di.uminho.pt</email>
          <email>ruipereira@di.uminho.pt</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jorge Mendes</string-name>
          <email>jorgemendes@di.uminho.pt</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>HASLab/INESC TEC &amp; Universidade do Minho</institution>
          ,
          <addr-line>Portugal CIICESI, ESTGF</addr-line>
          ,
          <institution>Instituto Politécnico do Porto</institution>
          ,
          <addr-line>Portugal RELEASE, Universidade da Beira Interior</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper showcases MDSheet, a framework aimed at improving the engineering of spreadsheets. This framework is model-driven, and has been fully integrated under a spreadsheet system. Also, its practical interest has been demonstrated by several empirical studies.</p>
      </abstract>
      <kwd-group>
        <kwd>Model-Driven Spreadsheets</kwd>
        <kwd>MDSheet</kwd>
        <kwd>Model Inference</kwd>
        <kwd>Embedding</kwd>
        <kwd>Bidirectional Synchronization</kwd>
        <kwd>Querying</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>We can not run the modern world without spreadsheets.
Spreadsheets are omnipresent, from individuals needing to
cope with simple needs to large companies needing to
implement complex forecasts or to produce advanced reports.</p>
      <p>The realization of such importance has made concrete
impact in the scienti c community as well. This is due to more
research teams devoting their e orts to improving
spreadsheets, and a growing number of scienti c events dedicated
to them.</p>
      <p>A successful approach to address spreadsheets under a
scienti c perspective consists of incorporating well-established
software engineering techniques in the spreadsheet
development process.</p>
      <p>Our approach is essentially based on precisely one such
technique: we adopt model-driven spreadsheet engineering.
In the setting we propose a spreadsheet is abstracted through
a concise model, which is then used to improve e
ectiveness and e ciency of spreadsheet users. The framework
we describe in this paper has been realized in a traditional
spreadsheet development system, thus not forcing
spreadsheet users to move to a di erent paradigm.</p>
      <p>The spreadsheet development framework that we envision
has been fully incorporated in a tool, MDSheet1, whose
features include:2
1) Model inference: we extract the abstract representation
from legacy spreadsheets;
2) Embedded models: this abstract representation is
manipulated and evolved in spreadsheets themselves;
3) User guidance: relying on this business model, we are able
of guiding users in avoiding traditional spreadsheet mistakes;
4) Model/instance synchronization: we support the
evolution of model and instances, ensuring an automatic
synchronization of the unevolved artifact;
5) Model quality assessment: a set of metrics on the
complexity of a spreadsheet model can be computed;
6) Querying: spreadsheet data can be queried.
2.</p>
    </sec>
    <sec id="sec-2">
      <title>SPREADSHEET ENGINEERING</title>
      <p>MDSheet is a framework for the engineering of
spreadsheets in a model-driven fashion. This framework is highly
extensible: we have actually extended it with several new
functionalities that we have developed in the last few years.
2.1</p>
    </sec>
    <sec id="sec-3">
      <title>Motivational Example</title>
      <p>
        The realization of our approach to spreadsheet engineering
builds upon the embedding of ClassSheets in a spreadsheet
system. So, we start by introducing ClassSheets within
MDSheet with the example given in Figure 1: we present a model
for a Budget spreadsheet (Figure 1a), which we adapted
from [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]3, and an instance of such model (Figure 1b).
      </p>
      <p>This model holds three classes where data is to be
inserted by end users: i) Year, with a default value of 2010,
for the budget to accommodate multi-year information, ii)
Category, for assigning a label to each expense and iii),
a(n implicit) relationship class where quantity and costs are
registered and totals are calculated based on them. The
actual spreadsheet may hold several repetitions of any of these
elements, as indicated by the ellipsis. For each expense we
record its quantity and its cost (with 0 as the default value),
and we calculate the total amount associated with it.
Finally, (simple) summation formulas are used to calculate the
global amount spent per year (cell D5), the amount spent
per expense type in all years (cell F3) and the total amount
spent in all years (cell F5) are also calculated.
1MDSheet is available through the SSaaPP project website:
http://ssaapp.di.uminho.pt.
2In the next section, we describe each such feature in a
different subsection.
3We assume colors are visible in the digital version of this
paper.</p>
      <p>(a) Model worksheet.
(b) Data/instance worksheet.</p>
      <p>Following is the description of the full set of features
offered by MDSheet.
2.2</p>
    </sec>
    <sec id="sec-4">
      <title>Model Inference</title>
      <p>
        A model-driven approach to spreadsheet engineering
offers an improved development experience: an abstract
representation of a spreadsheet, i.e., its model, helps us, among
other things, in guiding users into preventing errors. This
approach, however, requires the de nition of a model in
parallel with the spreadsheet it abstracts. In order to handle
legacy spreadsheets, i.e., the ones that have not been
developed from scratch together with their model, we have
devised a model inference technique [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], that has been
implemented in MDSheet. Concretely, we infer models in the
ClassSheets language, an object-oriented high-level
formalism to abstract spreadsheets [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
2.3
      </p>
    </sec>
    <sec id="sec-5">
      <title>Embedded Models</title>
      <p>
        The worksheet structure of spreadsheets is a decisive
factor in their modularity. In fact, we exploited precisely this
structure to make the model of a spreadsheet available within
spreadsheet systems themselves: one worksheet holds the
model of a spreadsheet, while another holds its data. This
embedding of spreadsheets has also been implemented under
MDSheet [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which was demonstrated in Section 2.1.
Moreover, we extended the ClassSheets language with database
constraints, such as unique columns/rows or foreign keys,
which have also been incorporated in MDSheet [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. In fact,
we have further extended the available restrictions so that
the user can specify the contents of a cell using regular
expressions or intervals [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Finally, we extended ClassSheets
with references between di erent models making them more
exible. Note that through this embedding we can
guarantee that spreadsheet data always conforms to a model.
2.4
      </p>
    </sec>
    <sec id="sec-6">
      <title>User Guidance</title>
      <p>The embedding of our extended version of the ClassSheet
language allows us to guide the user in inserting correct data.
When a model is designed, it serves as a guider in the
creation of a data worksheet, which is initially empty. Only
cells containing plain data can be edited as all other are
inferred from the model. This prevents, e.g., users from
making mistakes when de ning formulas as they are locked.
Moreover, the restrictions created in the model guarantee
that the data in the cells respects them. In the model it is
possible to de ne an interval of integers for a cell, or a
regular expression that the content must conform to. A column
or row can be marked as having only unique values or being
a foreign key to another column or row. All these
restrictions are enforced by MDSheet. In the case of foreign keys,
the user can use a combo box to select existing values from
the referred column/row.
2.5</p>
    </sec>
    <sec id="sec-7">
      <title>Model/Instance Synchronization</title>
      <p>
        As any other software artifact, spreadsheets evolve over
time. MDSheet accommodates changes by allowing the
evolution of models and instances, while automatically
coevolving the unchanged artifact. For this, we introduced a
formal framework to allow evolutions of the model to be
automatically spanned to the instances [
        <xref ref-type="bibr" rid="ref12 ref6 ref7">6, 7, 12</xref>
        ]. We have
later proposed techniques and tools to the evolution of data
by the user and corresponding automatic coevolution of the
model [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. We therefore ensure that model/instance
consistency is never broken.
2.6
      </p>
    </sec>
    <sec id="sec-8">
      <title>Model Quality Assessment</title>
      <p>
        In a rst attempt to measure the quality of a spreadsheet
model, we introduced a set of metrics to calculate the
complexity of ClassSheet models [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. These metrics are
implemented under MDSheet and can be calculated for any
ClassSheet de ned using it. They are then compared to the same
metrics computed for a repository of ClassSheet models so
users can have a reference point for such values. The
evolution mechanisms can then be used to evolve the spreadsheet
improving it according to the metrics calculated.
2.7
      </p>
    </sec>
    <sec id="sec-9">
      <title>Querying</title>
      <p>
        As many spreadsheets are used as data repositories, the
need to query their data is frequent. MDSheet also
integrates a query system, which allows the de nition of
modeloriented queries, in the style of traditional database queries.
This allows the writing of queries without having to
manually observe a possibly large number of columns and rows
of concrete data. Indeed, queries are written, by analyzing
models, as abstractions that are simpler to understand. Our
system was initially presented as a textual language [
        <xref ref-type="bibr" rid="ref1 ref4">1, 4</xref>
        ],
very similar to SQL. Even being textual it already was of
great help for users [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Still, we have further improved it
by embedding the language in a worksheet, thus creating a
visual language for spreadsheet querying [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
    </sec>
    <sec id="sec-10">
      <title>EMPIRICAL VALIDATION</title>
      <p>
        One of the purposes of our tool is to help users commit
less errors; if possible, it also intends to help users work
faster with spreadsheets. To assess these two concerns we
have run an empirical study and we have found empirical
evidence that indeed our model-driven spreadsheet
environment can in fact help users become more e cient and more
e ective [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
    </sec>
    <sec id="sec-11">
      <title>CONCLUSION</title>
      <p>in an agile testing framework of a software house; ii), to
adapt data produced by di erent database systems for a
car multimedia production company; and iii), to provide
spreadsheet models for a food bank.</p>
    </sec>
    <sec id="sec-12">
      <title>Acknowledgments</title>
      <p>This work is part funded by the ERDF - European Regional
Development Fund through the COMPETE Programme
(operational programme for competitiveness) and by National
Funds through the FCT - Fundaca~o para a Ci^encia e a
Tecnologia (Portuguese Foundation for Science and Technology)
within project FCOMP-01-0124-FEDER-020532. The rst
author was funded by the FCT grant SFRH/BPD/73358/2010.
5.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>O.</given-names>
            <surname>Belo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pereira</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          .
          <article-title>Querysheet: A bidirectional query environment for model-driven spreadsheets</article-title>
          .
          <source>In VLHCC '13</source>
          , pages
          <fpage>199</fpage>
          {
          <fpage>200</fpage>
          .
          <string-name>
            <surname>IEEE</surname>
            <given-names>CS</given-names>
          </string-name>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Erwig</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          .
          <article-title>Automatically inferring classsheet models from spreadsheets</article-title>
          .
          <source>In VLHCC '10</source>
          , pages
          <fpage>93</fpage>
          {
          <fpage>100</fpage>
          .
          <string-name>
            <surname>IEEE</surname>
            <given-names>CS</given-names>
          </string-name>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Pacheco</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          .
          <article-title>Bidirectional transformation of model-driven spreadsheets</article-title>
          .
          <source>In ICMT '12</source>
          , volume
          <volume>7307</volume>
          <source>of LNCS</source>
          , pages
          <volume>105</volume>
          {
          <fpage>120</fpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pereira</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          .
          <article-title>Querying model-driven spreadsheets</article-title>
          .
          <source>In VLHCC '13</source>
          , pages
          <fpage>83</fpage>
          {
          <fpage>86</fpage>
          .
          <string-name>
            <surname>IEEE</surname>
            <given-names>CS</given-names>
          </string-name>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pereira</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          .
          <article-title>Embedding model-driven spreadsheet queries in spreadsheet systems</article-title>
          .
          <source>In VLHCC '14</source>
          ,
          <year>2014</year>
          . to appear.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          .
          <article-title>Embedding and evolution of spreadsheet models in spreadsheet systems</article-title>
          .
          <source>In VLHCC '11</source>
          , pages
          <fpage>186</fpage>
          {
          <fpage>201</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendes</surname>
          </string-name>
          , and
          <string-name>
            <surname>J. Saraiva.</surname>
          </string-name>
          <article-title>MDSheet: A framework for model-driven spreadsheet engineering</article-title>
          .
          <source>In ICSE 2012</source>
          , pages
          <fpage>1412</fpage>
          {
          <fpage>1415</fpage>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          .
          <article-title>Extension and implementation of classsheet models</article-title>
          .
          <source>In VLHCC '12</source>
          , pages
          <fpage>19</fpage>
          {
          <fpage>22</fpage>
          .
          <string-name>
            <surname>IEEE</surname>
            <given-names>CS</given-names>
          </string-name>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          .
          <article-title>Complexity Metrics for ClassSheet Models</article-title>
          .
          <source>In ICCSA '13</source>
          , volume
          <volume>7972</volume>
          , pages
          <fpage>459</fpage>
          {
          <fpage>474</fpage>
          .
          <string-name>
            <surname>LNCS</surname>
          </string-name>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          . Embedding, evolution, and
          <article-title>validation of spreadsheet models in spreadsheet systems</article-title>
          .
          <year>2014</year>
          . submitted.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          .
          <article-title>From Relational ClassSheets to UML+OCL</article-title>
          . In SAC '
          <volume>12</volume>
          , pages
          <fpage>1151</fpage>
          {
          <fpage>1158</fpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Visser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Alves</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Saraiva</surname>
          </string-name>
          .
          <article-title>Type-safe evolution of spreadsheets</article-title>
          . In D. Giannakopoulou and F. Orejas, editors,
          <source>FASE '11</source>
          , volume
          <volume>6603</volume>
          <source>of LNCS</source>
          , pages
          <volume>186</volume>
          {
          <fpage>201</fpage>
          . Springer,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Engels</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Erwig</surname>
          </string-name>
          .
          <article-title>ClassSheets: automatic generation of spreadsheet applications from object-oriented speci cations</article-title>
          .
          <source>In ASE '05</source>
          , pages
          <fpage>124</fpage>
          {
          <fpage>133</fpage>
          . ACM,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Pereira</surname>
          </string-name>
          .
          <article-title>Querying for model-driven spreadsheets</article-title>
          .
          <source>Master's thesis</source>
          , University of Minho,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>