<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Preparing Meta-Analysis of Metamodel Understandability</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Susanne Patig</string-name>
          <email>susanne.patig@iwi.unibe.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Bern, IWI</institution>
          ,
          <addr-line>Engehaldenstrasse 8, CH-3012 Bern</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <fpage>11</fpage>
      <lpage>20</lpage>
      <abstract>
        <p>Metamodels are designed to be used by machines and humans. For human users, the understandability of the metamodel is important. Experimental investigations of understandability in computer science have led to conflicting results. To resolve such conflicts and gain insights into the nature of some phenomenon beyond singular experiments, meta-analysis can be applied, i.e., the statistical analysis of results obtained by other (primary) empirical studies. This paper shows the current obstacles for a meta-analysis of metamodel understandability: They consist in the heterogeneity of the individual experiments and deficient reporting. The paper provides a framework to increase the comparability of experiments on understandability. Such comparability enables future meta-analysis.</p>
      </abstract>
      <kwd-group>
        <kwd>Understandability</kwd>
        <kwd>Metamodels</kwd>
        <kwd>Experimental Research</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Motivation</title>
      <p>
        Designing and modifying metamodels are major topics of model-driven development.
Metamodels must be understandable for both machine and human users. Following a
definition of language understandability in cognitive psychology [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], the
understandability of a metamodel means the effort required to read and correctly interpret its
constructs and their connections. Understandability is a prerequisite both for reading
artifacts (like documents or source code) that have been created by applying a
metamodel (comprehension) and for creating such artifacts (specification).
      </p>
      <p>The ‘understandability’ of a metamodel for a machine shows up in error-free
compilation. For human users, metamodel understandability must be empirically
investigated, usually by controlled experiments. The results of such experiments are
conflicting (see Section 3).</p>
      <p>
        Conflicting empirical results can be statistically evaluated by meta-analysis (see
Section 2). Meta-analysis could increase our knowledge about the nature of
understandability – and thereby facilitate future metamodel design or modification. But, Section
3 shows that meta-analysis on metamodel understandability is currently hindered by
(1) the heterogeneity of the conducted experiments and (2) insufficient reporting of
the experimental results. This paper provides a framework to achieve comparability of
experiments on metamodel understandability (see Section 4), which is a prerequisite
for meta-analysis. Appropriate reporting guidelines exist (e.g., [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]).
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 Meta-Analysis</title>
      <p>
        Meta-Analysis is the statistical evaluation of numerical results that have been obtained
by other (primary) studies [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Hence, it is a kind of secondary research that
aims at (1) finding evidence for some investigated phenomenon beyond individual
studies (by calculating general descriptive statistics), (2) explaining conflicting results
(by discovering new influencing variables), and (3) removing the bias potentially
contained in ‘normal’ literature reviews of empirical studies [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
      </p>
      <p>
        Literature reviews often concentrate only on significant results that support the
reviewer’s theoretical position. But, statistical significance can be misleading, because
it is affected by sample size [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]: If the same experiment is conducted independently several times, a
larger sample may yield a statistically significant result, while a smaller one does not.
The ‘empirical truth’ can be revealed by effect size. Effect size expresses the
magnitude of a result, independently of sample size [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Table 1 summarizes main effect
size measures and defines what constitutes a small, medium or large effect.
Meta-analysis typically integrates the effect sizes of singular studies. The basic steps
are as follows [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]:
1. Define the independent and the dependent variables of interest.
2. Systematically collect the studies to be included in the meta-analysis.
3. Estimate effect sizes for each study.
4. Combine the individual effect sizes to calculate and test the central tendency (e.g.,
the mean or median) and dispersion (e.g., variance) of the overall effect.
Various ways of combining effect sizes exist (see, e.g. [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]). The combined effect size
quantifies the overall magnitude of some observed result, at least in the population of
the included studies. To yield useful results from meta-analysis, the included studies
must satisfy the following requirements [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]:
[RQ1] They must be of the same type (e.g., controlled experiments or case studies).
[RQ2] They must test the same hypothesis. Since a statistical hypothesis assumes
that the independent variable(s) will cause the changes in the dependent
variable(s) [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ], these variables should be identical or comparable.
[RQ3] Often several measures for the same variable exist. Ideally, all included
studies should use the same or comparable measures.
[RQ4] The studies should report effect sizes or provide at least statistics according to
Table 1 or raw data to calculate the effect sizes.
      </p>
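      <p>To make steps 3 and 4 above concrete, the following minimal sketch (in Python, with invented group statistics rather than data from any of the cited studies) estimates Cohen's d for two hypothetical primary studies and combines the two effect sizes by inverse-variance weighting, one common fixed-effect way of integrating them:</p>
      <preformat>
# Illustrative sketch: Cohen's d per study and a simple fixed-effect combination.
# The group statistics below are invented; they do not stem from the cited studies.
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference between two independent groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

def d_variance(d, n1, n2):
    """Approximate sampling variance of Cohen's d."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))

# Step 3: estimate an effect size for each primary study.
studies = [
    dict(mean1=8.1, sd1=2.0, n1=24, mean2=6.9, sd2=2.2, n2=25),  # hypothetical study A
    dict(mean1=7.4, sd1=1.8, n1=40, mean2=7.0, sd2=1.9, n2=38),  # hypothetical study B
]
effects = [(cohens_d(**s), s["n1"], s["n2"]) for s in studies]

# Step 4: combine the effect sizes with inverse-variance (fixed-effect) weights.
weights = [1.0 / d_variance(d, n1, n2) for d, n1, n2 in effects]
combined = sum(w * d for w, (d, _, _) in zip(weights, effects)) / sum(weights)
se_combined = math.sqrt(1.0 / sum(weights))

print("per-study d:", [round(d, 2) for d, _, _ in effects])
print("combined d:", round(combined, 2), "with standard error", round(se_combined, 2))
      </preformat>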
      <p>The next section will show that current experiments on the understandability of
metamodels do not satisfy these requirements.
</p>
    </sec>
    <sec id="sec-3">
      <title>3 Meta-Analysis of Research on Metamodel Understandability</title>
      <p>
        This section sketches a failed attempt at meta-analysis – to prepare the ground for the
framework in Section 4. The intended meta-analysis should find out whether certain
(types of) metamodels have proven to be generally more understandable for human
users. One of the earliest disputes relevant for this question took place in artificial
intelligence and concerned the merits of either predicate logic [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], which is usually
written as text, or visual representations and diagrams [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. This debate is excluded
here from further investigation as it is based only on (quite suggestive) examples and,
thus, differs in type from controlled experiments (see [RQ1] in Section 2).
      </p>
      <p>Table 2 lists some experiments examining the understandability of (types of)
metamodels. The selection of the studies (deliberately) does not satisfy the requirements
postulated in Section 2, as it is intended to point out the obstacles for meta-analysis:</p>
      <p>
        The experiments differ in their independent variables and, thus, in the hypotheses
(Ha) investigated, where Ha denotes the alternative hypothesis, given here in an aggregated
and simplified form. Most independent variables are related to metamodels, but refer to the
abstract syntax, i.e., the constructs and their allowed connections ([
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]: Ha: metamodels with more constructs are easier to understand), to the
concrete syntax, i.e., the notation ([
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]; Ha: graphical notation is easier to understand) or to a mixture
of both ([
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]). In the mixture case, the understandability of particular
metamodels (the listed ‘levels’ in Table 2) is tested, whereas syntactically pure
independent variables characterize types of metamodels. Besides the metamodel, other
factors influencing understandability are also investigated, e.g., the complexity of the
presented artifacts [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and the knowledge of the participants [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
      <p>
        The dependent variables are more homogeneous (correctness, time, perceived ease
of use), but the particular measures vary. For example, correctness is quantified by the
number of correct answers and by reviews. Additionally, diverse experimental
designs have been used. Experimental design, i.e., the way participants are selected
and assigned to experimental conditions [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], is discussed in Section 4.2.
      </p>
      <p>
        None of the studies in Table 2 reported effect sizes. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] and [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]
provide at least enough aggregated data to calculate the effect sizes ex post according
to Table 1. The effects are small [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], medium [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] or large [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]; see
Table 2. But, because of the heterogeneous variables and hypotheses, a
methodologically sound meta-analysis cannot be conducted.
      </p>
      <p>
        Meta-analysis of understandability would be facilitated by some guideline for the
planning, conducting and reporting of the underlying experiments. The following
groups of guidelines have been proposed:
1. General guidelines on experimental research in software engineering (e.g., [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ],
[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]) with ‘best practices’ for planning, conducting, evaluating and reporting any
kind of experiment. They do not help researchers in selecting variables and
experimental designs to investigate understandability.
2. Guidelines on reporting the results of experiments, e.g., [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Though the
latter ones have recently been criticized [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], they provide a solid foundation for
the prospective availability of data needed to calculate effect sizes.
3. Guidelines for experiments in the field of conceptual modeling, e.g., [<xref ref-type="bibr" rid="ref23">23</xref>], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
or management information systems (MIS) research [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]: These guidelines cover
specific aspects of metamodel understandability (e.g., the role of domain
knowledge) [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], remain vague sets of hints without well-founded recommendations of
variables or experimental designs [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] or aim at classifying existing experimental
studies [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. As a consequence, the classification guideline [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] concentrates on
variables that have been used in experiments on metamodel understandability, but
neglects potential variables known from cognitive psychology, which is the major
field for scientific investigations of understandability. Meta-analytic comparability,
however, requires the consideration of all known factors affecting some
phenomenon. Experimental design is only discussed in the MIS research framework [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
Because this framework focuses on the usage of MIS, ‘metamodel’ is not considered an
independent variable. Corresponding modifications of the framework have been
proposed [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], but remain at the surface. Additionally, the MIS research framework
differs in terminology and methodology from empirical software engineering.
To sum up, owing to heterogeneous experiments and deficient reporting of the
experimental results, meta-analysis of metamodel understandability is currently not
possible, even though appropriate reporting guidelines exist. The next section proposes a
framework that is intended to increase the comparability of experiments on metamodel
understandability, which is a prerequisite for meta-analysis.
      </p>
    </sec>
    <sec id="sec-4">
      <title>A Framework for Comparable Experiments on Metamodel</title>
    </sec>
    <sec id="sec-5">
      <title>Understandability</title>
      <sec id="sec-5-1">
        <title>4.1 Affecting Factors</title>
        <p>
          An experiment is a scientific investigation in which one or more independent
variables (IV) are systematically manipulated to observe their effects on one or more
dependent variables (DV) [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]. The outcome of an experiment depends on the
affecting factors [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. This term comprises both independent variables whose (causal)
relationship to the dependent variables is examined and other factors (extraneous
variables, EV) that confound the causal results [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]. Whether some affecting factor
constitutes an independent or an extraneous variable is, to some extent, a matter of the
researcher’s decision (contingent on the research question, the availability of
participants, costs, etc.). This decision requires knowledge of, ideally, all the factors
that affect the outcomes of an experiment. For experiments on understandability in
computer science, this knowledge is provided by Fig. 1.
        </p>
        <p>[Fig. 1. Affecting factors for experiments on understandability – general factors (extraneous variables): conduct, experimenter; potential independent variables: modeling (metamodel, content, tool), participants (knowledge, demographics) and task (type: surface level with syntactic and semantic variants, problem solving; size).]</p>
        <p>
          A distinction can be made between factors that affect the outcome of any experiment
(general affecting factors) and factors with a known influence on understandability;
see Fig. 1. In the field of behavioral sciences (to which cognitive psychology
belongs), the following general affecting factors are acknowledged:
• The conduct of the experiment, comprising:
• The experimental situation, namely the location (noise, room temperature), the
time of day and the equipment (failures, calibration) [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
• Position effects: Performance depends on the temporal distance of a task from the
start of the experiment (e.g., fatigue, getting bored, learning) [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ].
• Carry-over effect: The performance achieved in some task depends on whether
or not some other task has been done before [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ].
• The experimenter: His/her ability to instruct participants; his/her bias (expecting a
particular outcome can distort the experimenter’s behavior or data gathering) [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ].
These general affecting factors are not causally related to the dependent variables, but
distort the experimental results and, thus, are extraneous variables. In contrast, in
investigating metamodel understandability, the following affecting factors – related to
modeling, participants and task - are potential independent variables (see Fig. 1):
        </p>
        <p>
          Both the metamodel’s abstract syntax (e.g., the number [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] or type [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] of
constructs) and its concrete syntax (graphical vs. textual notation; e.g., [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]) affect
understandability. Metamodels cannot be tested in isolation, but only by applying them to
some content. The content should be ‘informationally equivalent’ [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], i.e., it must be
possible to model this content by any of the investigated metamodels, and the content
should be comparably difficult. Finally, the tool used to create or present models (e.g.,
its navigation or dynamic layout capabilities) influences understandability.
        </p>
        <p>
          Among the affecting factors, participants play an intermediate role: Their
demographic characteristics (e.g., age, gender) affect any experiment [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] and, thus, also
understandability. For example, the participants’ age is treated as an independent
variable in MIS research [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Knowledge comprises experience and skills related to
domain and metamodel as well as general mental abilities. Domain knowledge
distorts results on metamodel understandability as it enables inferences [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]. Metamodel
knowledge is usually provided in preparing the participants for the experiment.
        </p>
        <p>
          Tasks in experiments on understandability can be characterized by their type and
size. As Table 2 indicates, the task types used are comprehension or specification
(defined in Section 1), which agree with the dependent variables that cognitive psychology
suggests (see Section 4.3). Comprehension tasks can be subdivided into surface-level
understanding and problem-solving tasks [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. In problem-solving tasks, participants
are requested to determine whether and how certain information can be retrieved from
an artifact created by applying the metamodel. In contrast, syntactic surface level
understanding tasks refer to the constructs of the metamodel and their relationships
(e.g., ‘How many attributes describe the entity type ORDER?’), whereas semantic
tasks assess the understanding of the contents described (e.g., ‘Every employee has
(a) a unique employee number, (b) more than one employee number.’) [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. An
influence of the size of some task, e.g., the complexity of the database described by
some metamodel, is generally assumed, but it was only marginally significant in [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>
          Depending on the decision of the researcher, a potential independent variable is
either systematically manipulated or becomes an extraneous variable. Extraneous
variables decrease the internal validity of experiments, i.e., the degree to which the
variation of the dependent variables can be attributed to the independent variables
(rather than to some other factor) [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]. Consequently, extraneous variables must be
controlled, which is a main constituent of experimental design (see Section 4.2).
        </p>
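        <p>As a purely illustrative sketch (the factor names and control choices below are hypothetical and merely mirror the taxonomy of Fig. 1), the affecting factors of a single experiment could be declared in machine-readable form, stating for each factor whether it is treated as an independent variable (IV) or an extraneous variable (EV) and, for the latter, which control technique (see Section 4.2) is applied:</p>
        <preformat>
# Hypothetical declaration of the affecting factors (Fig. 1) for one experiment.
# Each factor is marked as an independent variable (IV) or an extraneous variable (EV);
# for extraneous variables, the applied control technique is recorded.
affecting_factors = {
    # General factors (extraneous by nature)
    "conduct.situation":         {"role": "EV", "control": "removed (quiet room, identical equipment)"},
    "conduct.position_effects":  {"role": "EV", "control": "randomized task order"},
    "conduct.carry_over":        {"role": "EV", "control": "randomized task order"},
    "experimenter":              {"role": "EV", "control": "constant (same instructor, scripted briefing)"},
    # Understandability-specific factors (potential independent variables)
    "modeling.metamodel":        {"role": "IV", "levels": ["EER model", "relational model"]},
    "modeling.content":          {"role": "EV", "control": "constant (informationally equivalent content)"},
    "modeling.tool":             {"role": "EV", "control": "constant (same presentation tool)"},
    "participants.knowledge":    {"role": "EV", "control": "randomization"},
    "participants.demographics": {"role": "EV", "control": "randomization"},
    "task.type":                 {"role": "EV", "control": "constant (surface-level, syntactic questions)"},
    "task.size":                 {"role": "EV", "control": "constant"},
}
        </preformat>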
      </sec>
      <sec id="sec-5-2">
        <title>4.2 Experimental Design</title>
        <p>
          An experimental design can be regarded as a general plan for (types of) experiments
that joins independent variables and control techniques for extraneous variables. The
main control techniques are removing, constancy and randomization [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ], [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]; they
should be applied in the following order:
1. Remove the extraneous variable (EV), especially if it is related to the experimental
situation (e.g., use a quiet room).
2. If the EV cannot be removed, its influence on the dependent variable is known, and
the sample is small, keep the EV constant. Constancy guarantees that all conditions
are identical except for the manipulation of the independent variable, but reduces
the external validity of the experiment, i.e., its generalizability [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ].
3. If sample size does not matter and the influence of some irremovable EV on the
dependent variable is not known for certain (e.g., gender), must be neutralized (e.g.,
position or carry-over effects) or should be equated across groups (e.g., age, knowledge),
randomize the EV. Randomization increases the external validity of experiments.
The experimental design to be chosen depends on (1) the number of independent
variables and (2) the control technique. Table 3 summarizes typical experimental
designs and their (dis-) advantages (for details, see [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ], [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]). Experimental
design and the dependent variables determine the statistical test procedures for
evaluation (see Table 3). For each statistical procedure, an effect size measure exists
(see Table 1). The sample size required to detect a small, medium or large effect for a
given experimental design and statistical test procedure can be calculated by power
analysis (e.g., [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]); the resulting recommendations are given in Table 3.
        </p>
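        <p>The following minimal sketch (in Python, using the normal approximation for a two-sided comparison of two independent groups, not any particular power-analysis package) illustrates how such sample-size recommendations can be derived for small, medium and large effects expressed as Cohen's d:</p>
        <preformat>
# Approximate per-group sample size for detecting an effect of size d with a
# two-sided, two-sample comparison: n is roughly 2 * ((z_alpha + z_power) / d)^2.
import math
from scipy.stats import norm

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate number of participants per group needed to detect effect size d."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_power = norm.ppf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

# Cohen's conventions for d: small = 0.2, medium = 0.5, large = 0.8.
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(label, "effect: about", n_per_group(d), "participants per group")
        </preformat>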
      </sec>
      <sec id="sec-5-3">
        <title>4.3 Affected Factors</title>
        <p>
          The dependent variable is the one on which the effect of the independent variable is
measured. Behaviorism, the origin of experimental research in psychology, requires
the dependent variable to refer to observable behavior [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Thus, ‘perceived ease of
use’ (even though applied, see Table 2) is not an acceptable dependent variable.
Instead, the following measures of behavior are common [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]:
1. Frequency, e.g., the number of correct answers or solved problems.
2. Selection, e.g., which of several answers is chosen.
3. Response latency (or response time), which is concerned with how long it takes for
a behavior to be emitted, e.g., how quickly a participant reacts.
4. Response duration, i.e., the length of time some behavior occurs (e.g., how long a
participant deals with a task).
5. Amplitude, measuring the strength of response.
        </p>
        <p>
          The dependent variables in experiments on metamodel understandability (see Table 2)
use these measures as follows: Solution time refers to response latency and modeling
time to response duration. If correctness is verified by multiple-choice questions (e.g.,
[
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]), it is based on the measure ‘selection’, whereas numbers of correct answers are a
measure of frequency.
        </p>
        <p>
          Thus, the dependent variables in experiments on understandability in computer
science are well-grounded in cognitive psychology. Completeness could be achieved
by measuring amplitude, which, however, is mainly common in neuroscience [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], and
by using selection of some metamodel from a list in specification tasks.
        </p>
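        <p>As a small, hypothetical illustration (all field names below are invented for this sketch), a single response in a comprehension task could record these behavioral measures as follows:</p>
        <preformat>
# Hypothetical record of one participant's answer to one comprehension question,
# showing how the behavioral measures map onto dependent variables.
from dataclasses import dataclass

@dataclass
class TaskResponse:
    participant_id: str
    question_id: str
    chosen_option: str          # selection: which multiple-choice answer was picked
    correct: bool               # contributes to frequency: number of correct answers
    response_latency_s: float   # response latency: time until the answer was started
    response_duration_s: float  # response duration: time spent on the task overall

responses = [
    TaskResponse("P01", "Q1", "b", True, 4.2, 31.5),
    TaskResponse("P01", "Q2", "a", False, 6.8, 44.0),
]
correct_count = sum(1 for r in responses if r.correct)  # frequency measure
print("correct answers:", correct_count)
        </preformat>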
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5 Conclusion</title>
      <p>
        Missing comparability of the integrated studies is a major reservation about
meta-analysis [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. But, comparability of heterogeneous experiments can be achieved by
methodologically equalizing differences among experiments [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] – provided that the
differences are known. In other words, sound meta-analysis is possible if all variables
and (for EV) their control techniques are reported. The taxonomies provided by the
framework (see Section 4) help researchers to compile such lists; further advances can
be achieved by web-publishing them (and the related experimental studies) as well as
by tool support for the experiments on understandability. A simple open-source tool
called notate already exists (http://sourceforge.net/projects/notate). It has been
successfully applied in experiments on understandability [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] and can be extended to
cover the complete framework of Section 4.
      </p>
      <p>
        In contrast to the narrow view of MIS research, extensibility and flexibility are
major requirements for a framework to investigate understandability in computer
science, since the nature of language understanding in general is still an open research
question in cognitive psychology [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Workshops are an appropriate place to
exchange experience in this field and to advance the framework proposed here.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          :
          <source>Cognitive Psychology and its Implications</source>
          . 5th ed.,
          <string-name>
            <surname>Worth</surname>
          </string-name>
          , New York (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Aranda</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ernst</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horkoff</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Easterbrook</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>A Framework for Empirical Evaluation of Model Comprehensibility</article-title>
          .
          <source>Proc. Intern. Workshop on Modeling in Software Engineering (MISE'07)</source>
          , Minneapolis/ MN. IEEE (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bajaj</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The effect of the number of concepts on the readability of schemas: an empirical study with data models</article-title>
          .
          <source>Requirements Engineering</source>
          <volume>9</volume>
          ,
          <fpage>261</fpage>
          -
          <lpage>270</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Basili</surname>
            ,
            <given-names>V.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Selby</surname>
            ,
            <given-names>R.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hutchens</surname>
            ,
            <given-names>D.H.</given-names>
          </string-name>
          :
          <article-title>Experimentation in Software Engineering</article-title>
          .
          <source>IEEE Transactions on Software Engineering SE-12 7</source>
          ,
          <fpage>733</fpage>
          -
          <lpage>743</lpage>
          (
          <year>1986</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Batra</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoffer</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bostrom</surname>
            ,
            <given-names>R.P.</given-names>
          </string-name>
          :
          <article-title>Comparing Representations with Relational and EER Models</article-title>
          .
          <source>Comm. of the ACM</source>
          <volume>33</volume>
          ,
          <fpage>126</fpage>
          -
          <lpage>139</lpage>
          (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Bock</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ryan</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Accuracy in Modeling with Extended Entity Relationship and Object Oriented Data Models</article-title>
          .
          <source>J. of Database Management</source>
          <volume>4</volume>
          ,
          <fpage>30</fpage>
          -
          <lpage>39</lpage>
          (
          <year>1993</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Bodart</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sim</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weber</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Should Optional Properties Be Used in Conceptual Modelling? A Theory and Three Empirical Tests</article-title>
          .
          <source>Information Systems Research</source>
          <volume>12</volume>
          ,
          <fpage>384</fpage>
          -
          <lpage>405</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Chan</surname>
            ,
            <given-names>H.C.</given-names>
          </string-name>
          :
          <article-title>Naturalness of Graphical Queries Based on the Entity Relationship Model</article-title>
          .
          <source>J. of Database Management</source>
          <volume>6</volume>
          ,
          <fpage>3</fpage>
          -
          <lpage>13</lpage>
          (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Clark-Carter</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          : Quantitative psychological research. Psychology Press, Hove (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Cohen</surname>
          </string-name>
          , J.:
          <article-title>Statistical Power Analysis for the Behavioral Sciences</article-title>
          . 2nd ed., Erlbaum, Hillsdale (
          <year>1988</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Gemino</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wand</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>A framework for empirical evaluation of conceptual modeling techniques</article-title>
          .
          <source>Requirements Engineering</source>
          <volume>9</volume>
          ,
          <fpage>248</fpage>
          -
          <lpage>260</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Hayes</surname>
            ,
            <given-names>P.J.:</given-names>
          </string-name>
          <article-title>Some Problems and Non-Problems in Representation Theory</article-title>
          .
          <source>Proc. of the AISB Summer Conference</source>
          . University of Sussex,
          <fpage>63</fpage>
          -
          <lpage>79</lpage>
          (
          <year>1974</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Hwang</surname>
            ,
            <given-names>M.I.</given-names>
          </string-name>
          :
          <article-title>The Use of Meta-Analysis in MIS Research: Promises and Problems</article-title>
          .
          <source>The DATA BASE for Advances in Information Systems</source>
          <volume>27</volume>
          ,
          <fpage>35</fpage>
          -
          <lpage>48</lpage>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Jamison</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teng</surname>
          </string-name>
          , J.T.C.
          <article-title>: Effects of Graphical Versus Textual Representation of Database Structure on Query Performance</article-title>
          .
          <source>J. of Database Management</source>
          <volume>4</volume>
          ,
          <fpage>16</fpage>
          -
          <lpage>23</lpage>
          (
          <year>1993</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Jenkins</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>MIS Design Variables and Decision Making Performance</article-title>
          . UMI Research Press, Ann Arbor (
          <year>1976</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Jedlitschka</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pfahl</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Reporting Guidelines for Controlled Experiments in Software Engineering</article-title>
          .
          <source>Proc. of ACM/IEEE Intern. Symposium on Software Engineering</source>
          <year>2004</year>
          (ISESE
          <year>2004</year>
          ),
          <fpage>261</fpage>
          -
          <lpage>270</lpage>
          . IEEE (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Khatri</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vessey</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramesh</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clay</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Park</surname>
          </string-name>
          , S.-J.:
          <article-title>Understanding Conceptual Schemas: Exploring the Role of Application and IS domain Knowledge</article-title>
          .
          <source>Information Systems Research</source>
          <volume>17</volume>
          ,
          <fpage>81</fpage>
          -
          <lpage>99</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Kitchenham</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          et al.:
          <article-title>Evaluation guidelines for reporting empirical software engineering Studies</article-title>
          .
          <source>Empirical Software Engineering</source>
          <volume>13</volume>
          ,
          <fpage>97</fpage>
          -
          <lpage>121</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Kitchenham</surname>
            ,
            <given-names>B.A.</given-names>
          </string-name>
          et al.:
          <article-title>Preliminary Guidelines for Empirical Research in Software Engineering</article-title>
          .
          <source>IEEE Transactions on Software Engineering</source>
          <volume>28</volume>
          ,
          <fpage>721</fpage>
          -
          <lpage>734</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Choi</surname>
            ,
            <given-names>B.G.</given-names>
          </string-name>
          :
          <article-title>A Comparative Study of Conceptual Data Modeling Techniques</article-title>
          .
          <source>J. of Database Management</source>
          <volume>9</volume>
          ,
          <fpage>26</fpage>
          -
          <lpage>35</lpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Mook</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Classic experiments in psychology</article-title>
          . Greenwood,
          <string-name>
            <surname>Westport</surname>
          </string-name>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Palvia</surname>
            ,
            <given-names>P.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liao</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>To</surname>
          </string-name>
          , P.-L.:
          <article-title>The Impact of Conceptual Data Models on End-User Performance</article-title>
          .
          <source>J. of Database Management</source>
          <volume>3</volume>
          ,
          <fpage>4</fpage>
          -
          <lpage>15</lpage>
          (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Parsons</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cole</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>What do the pictures mean? Guidelines for experimental evaluation of representation fidelity in diagrammatical conceptual modelling techniques</article-title>
          .
          <source>Data &amp; Knowledge Engineering</source>
          <volume>55</volume>
          ,
          <fpage>327</fpage>
          -
          <lpage>342</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Patig</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>A Practical Guide to Testing the Understandability of Notations</article-title>
          .
          <source>Proc. 5th AsiaPacific Conf. on Conceptual Modelling (APCCM</source>
          <year>2008</year>
          ).
          <source>CRPIT Volume 79. ACS</source>
          , (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Pickard</surname>
            ,
            <given-names>L.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kitchenham</surname>
            ,
            <given-names>B.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>P.W.:</given-names>
          </string-name>
          <article-title>Combining empirical results in software engineering</article-title>
          .
          <source>Information and Software Technology</source>
          <volume>40</volume>
          ,
          <fpage>811</fpage>
          -
          <lpage>812</lpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>P.W.</given-names>
          </string-name>
          : Fundamentals of Experimental Psychology, 2nd ed., Prentice-Hall, Englewood
          <string-name>
            <surname>Cliffs</surname>
          </string-name>
          (
          <year>1981</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Rosenthal</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , DiMatteo, M.R.:
          <article-title>Meta-Analysis: Recent Developments in Quantitative Methods for Literature Reviews</article-title>
          .
          <source>Annual review of psychology 52</source>
          ,
          <fpage>59</fpage>
          -
          <lpage>82</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Sarafino</surname>
            ,
            <given-names>E.P.</given-names>
          </string-name>
          : Research Methods:
          <article-title>Using Processes and Procedures of Science to Understand Behavior</article-title>
          . Pearson, Upper Saddle River (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Sloman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <source>Interactions Between Philosophy and Artificial Intelligence. Artificial Intelligence</source>
          <volume>2</volume>
          ,
          <fpage>209</fpage>
          -
          <lpage>225</lpage>
          (
          <year>1971</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>