<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Assessing the Frequency of Empirical Evaluation in Software Modeling Research</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jeffrey C. Carver</string-name>
          <email>carver@cs.ua.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eugene Syriani</string-name>
          <email>esyriani@cs.ua.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jeff Gray</string-name>
          <email>gray@cs.ua.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Alabama, Department of Computer Science Tuscaloosa</institution>
          ,
          <addr-line>Alabama</addr-line>
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2011</year>
      </pub-date>
      <abstract>
        <p>Researchers in software modeling often publish new tools or methodologies that claim to offer some advantage to the modeling community. There are different methods by which those claims can be evaluated. In this paper, we examine the degree to which such claims are supported by various types of empirical evaluation. We surveyed five editions of the MoDELS conference, from 2006 through 2010, as well as the primary conference that focuses on empirical software engineering (the International Symposium on Empirical Software Engineering and Measurement), to understand the frequency with which empirical evaluation has been reported in the software modeling community. Our summary of 266 MoDELS papers found that 195 (73%) of the publications performed no empirical evaluation. This paper summarizes our findings from that survey and offers recommendations for raising awareness of the need for empirical evaluation in software modeling research.</p>
      </abstract>
      <kwd-group>
        <kwd>Empirical software engineering</kwd>
        <kwd>Model-driven Engineering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Research into software modeling has attracted many creative and transformative ideas
over the past decade, ranging from new methods for defining languages and
transforming their model instances, to higher level performance analysis and
verification tools that abstract the essence of some system property. Although the
novelty of software modeling research has led to numerous advances, the collective
body of work in this area has not always followed the typical tenets of a scientific
discipline. One of the key precepts of scientific investigation is the ability to repeat an
experiment to verify that some new scientific discovery can be confirmed under
numerous scenarios. For most contributions in model-driven engineering, some new
tool or technique is often proposed and discussed through an illustrative case study,
but generally is not evaluated at the level of rigor assumed for a traditional empirical
evaluation.</p>
      <p>Our suspicion about the level of empirical studies in modeling research led to this
summary paper that analyzes the degree of empirical evaluation in software modeling
research. To approach this topic, we analyzed the most recent five editions (from
2006 through 2010) of the most influential conference in software modeling – the
conference on Model-Driven Engineering, Languages and Systems (MoDELS). Two
of the authors of this paper (Gray and Syriani) have themselves published papers at
this conference that did not contain an empirical evaluation. We were curious about
the extent to which this practice is common in the software modeling community. In
addition to observing contributions at MoDELS, we also considered the prevalence of
modeling papers at a venue focused on empirical software engineering. The
remainder of this paper summarizes our findings from an analysis of 266 MoDELS
papers. Our suspicions were confirmed by our analysis, which suggests that a large
majority of research papers in the modeling community fail to provide any level of
empirical evidence to support the claims of benefit made in those papers.</p>
      <p>The next section of this paper provides an overview of empirical studies and the
methodology that we used in conducting our analysis of MoDELS papers. Section 3
presents the results of our analysis of the MoDELS conference and our analysis of
software modeling papers that have appeared in the flagship empirical software
engineering conference, the International Symposium on Empirical Software
Engineering and Measurement (ESEM). Section 4 discusses the results of the analysis
in more detail. Finally, we conclude the paper in Section 5.
</p>
    </sec>
    <sec id="sec-2">
      <title>Overview of Empirical Studies and Methodology</title>
      <p>For a new modeling tool or technique to be adopted, its developer must
demonstrate its value. Although a proof-of-concept or illustrative
example is an important first step in establishing the usefulness of a technique or tool,
claims about the usefulness of modeling techniques and tools cannot be fully
validated without the use of various types of empirical studies. An empirical study is a
validation method that draws conclusions based on observations (as opposed to proof,
argumentation, or expert opinion).</p>
      <p>
        In the larger software engineering community, empirical studies have commonly
been used to understand developer behavior in a number of important areas. There is
an entire sub-community focused on validating software engineering claims via
empirical study. This sub-community has a conference (ESEM), a Springer journal
(Empirical Software Engineering) and a number of handbooks [3], [
        <xref ref-type="bibr" rid="ref10">9</xref>
        ], [
        <xref ref-type="bibr" rid="ref15">14</xref>
        ]. The first
author of this paper comes from this community.
      </p>
      <p>The goal of this investigation was to determine how many papers had some type of
empirical evaluation of their claims. We realize that there are evaluation methods
other than empirical studies (e.g., demonstration/proof-of-concept or theoretical
proof). But, in this paper, we focus only on empirical evaluation. Among the three
authors, two are experts in the modeling domain and one is an expert in the empirical
software engineering domain. Working together, we were able to complement each
other’s expertise to perform this analysis.</p>
      <p>We used a three-step process for identifying which papers contain an empirical
component. The first step was to develop an initial characterization scheme. Next, the
two modeling experts individually analyzed different years of the MoDELS
proceedings to identify and classify the papers. Third, the empirical studies
expert reviewed the papers identified in step two and validated the classification of
those papers. Step 3 resulted in some modifications to the characterization scheme.
The remainder of this section describes each step in more detail.</p>
      <sec id="sec-2-1">
        <title>2.1. Step 1 – Develop an initial characterization scheme</title>
        <p>We began with the assumption that there are two types of empirical studies: those that
are more analytical (i.e., perform some type of analysis of a tool and its properties
without using humans) and those that are human-based (i.e., they involved studying
one or more people using a modeling technique). For each type of study, we created
two categories: “non-rigorous” and “rigorous.” The difference between rigorous and
non-rigorous was subjective and ill-defined at this first stage of analysis.</p>
        <p>Because we had no preconceived notions of the results of the literature search, this
initial characterization scheme was necessarily vague. We realized that after
examining the actual papers, we would have to refine the characterization scheme to
accurately describe the identified papers.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Step 2 – Identification of candidate papers</title>
        <p>The two modeling experts divided the five years of MoDELS proceedings between
them and individually analyzed all of the papers. For each paper, they first determined
whether there was any type of empirical study and whether it was human-based. At
this stage, they also made a subjective determination as to whether a paper was
rigorous. After this step, we developed a spreadsheet that characterized each paper
into one of five categories: no empirical study, non-rigorous non-human, rigorous
non-human, non-rigorous human and rigorous human.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Step 3 – Review of candidate papers and finalization of characterization</title>
        <p>The empirical software engineering expert then reviewed each paper that the
modeling experts identified during Step 2 as having an empirical study. The goal of
this process was to provide a second observation to validate the characterization from
Step 2. During the review, it quickly became apparent that our initial characterization
scheme was inadequate. We refined the initial characterization as follows.</p>
        <p>First, we more clearly defined the term “empirical study.” Some of the candidate
papers identified during Step 2 really contained just a demonstration or
implementation of the new tool or technique rather than an empirical study. In fact,
several MoDELS papers had an “Evaluation” section that was merely a discussion of
lessons learned, rather than what those in the empirical software engineering
community would call an empirical study. We clarified the definition of what we
considered as an empirical study to exclude papers that clearly did not gather any type
of data to evaluate the proposed tool or technique.</p>
        <p>In reviewing the papers, we identified two types of empirical papers:
1. Papers that propose a new tool or technique and then perform some type of
evaluation of it.
2. Papers that gather information about the use of modeling techniques in
practice. These papers do not propose new approaches; rather, they study
existing approaches or survey users to develop requirements for tools or
techniques that may be needed. We call these papers Formative Case
Studies, as opposed to the Illustrative Case Studies that just illustrate the use
of a new tool or technique.</p>
        <p>Second, we refined the original characterization scheme to define more concretely the
categories into which the papers could be classified. The revised characterization
scheme is as follows:
1. No empirical evaluation – the paper did not provide any type of empirical
evaluation of the proposed tool or technique (this, unfortunately, represented
the overwhelming majority of the papers we analyzed).
2. Non-human evaluation of the proposed tool/technique only – the paper
offered some type of empirical evaluation (e.g., performance or correctness)
of the proposed tool, but did not compare the new tool against other tools or
benchmarks.
3. Non-human evaluation of proposed tool/technique by comparison with other
tools – the paper provided an empirical evaluation by comparing the
proposed tool/technique against one or more existing tools or benchmarks to
evaluate some aspect of the new tool/technique.
4. Observation of humans using new tool/technique – the paper discussed and
analyzed the results from the use of the new tool/technique by one or more
people other than the authors of the paper.
5. Human-based controlled experiment – the paper described a controlled
experiment where the new tool/technique was compared against one or more
existing approaches through a human-based controlled experiment where
each participant used one or more approaches and provided data that could
be analyzed to evaluate the new tool/technique.
6. Formative case study – as defined above.</p>
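        <p>The revised characterization scheme can be sketched as a small data structure. The following is a minimal illustration, not the spreadsheet the authors actually used: the identifier names and the sample records are assumptions introduced here, and only the six category meanings come from the list above.</p>
        <preformat>
```python
from collections import Counter
from enum import Enum


class Category(Enum):
    """The six categories of the revised characterization scheme."""
    NO_EVALUATION = 1
    NON_HUMAN_NO_COMPARISON = 2
    NON_HUMAN_WITH_COMPARISON = 3
    HUMAN_OBSERVATION = 4
    CONTROLLED_EXPERIMENT = 5
    FORMATIVE_CASE_STUDY = 6


def tally(classified):
    """Count papers per category; categories with no papers get 0."""
    counts = Counter(category for _title, category in classified)
    return {category: counts.get(category, 0) for category in Category}


# Hypothetical records for illustration only; not papers from the survey.
sample = [
    ("Paper A", Category.NO_EVALUATION),
    ("Paper B", Category.NO_EVALUATION),
    ("Paper C", Category.CONTROLLED_EXPERIMENT),
]
result = tally(sample)
```
        </preformat>
        <p>Assigning each paper exactly one category keeps the per-category counts mutually exclusive, so they sum to the number of papers analyzed.</p>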
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results of Literature Survey</title>
      <p>This section summarizes the results of our survey of the MoDELS papers and of the
modeling papers that appeared in the ESEM conference.</p>
      <sec id="sec-3-1">
        <title>3.1. Results of MoDELS Survey</title>
        <p>We analyzed a total of 266 papers published at MoDELS from 2006 through 2010.
The complete analysis of the papers took approximately
18 hours of observation and recording. Table 1 summarizes the results of this
assessment.</p>
        <p>In every year, papers without any evaluation predominated, ranging from
61% in 2010 up to 82% in 2006. However, the trend suggests a rising awareness
of the need for empirical studies: the share of papers with no evaluation
decreased by about 4 percentage points per year on average (a drop of 21
percentage points in the “No Evaluation” category from the beginning of our
study period to the end). We have no direct evidence for the cause of this
improvement, but feedback sent to authors in reviews over the period of the
study may indicate an emerging demand among the Program Committee for more
rigorous evaluation.</p>
        <p>Those papers that did have some form of empirical study were often restricted to
simple evaluations of performance or correctness of the proposed tool/technique
without comparing it to other results (41% of those papers describing an empirical
study were in the “No comparison” category). The papers in 2007 seem to be the only
exception, where 11% of all papers addressed comparisons with other tools or
benchmarks.</p>
        <p>On average, about 11% of the papers were supported by empirical studies
involving humans. In this category, 42% of the papers contained controlled
experiments, representing not more than 7% of all papers (years 2008 and 2010). The
number of papers where the evaluation was observed by at least one external
participant has been quite steady at about 3% of all papers. Formative case studies are
gaining popularity with up to 6% of all the papers in 2010.</p>
        <p>The “Total” row (at the bottom of Table 1) shows the portions occupied by each of
the categorizations defined in Section 2 across all years. Although 73% of the papers
published at MoDELS do not contain an evaluation, 10% of the papers only evaluate
their own tool without any comparison to other approaches. Thus, only the remaining
17% of the papers involve an empirical evaluation of the proposed tool or technique.
However, according to Fig. 1, this number is increasing every year: up to 24% in
2010. This trend may suggest that authors are aware of the lack of empirical evidence
in the modeling community and are now working on filling this gap.</p>
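        <p>The arithmetic behind these overall shares can be checked with a short sketch. Only the two raw counts (266 papers, 195 without evaluation) and the 10% figure come from the text; the function and variable names are assumptions made for illustration.</p>
        <preformat>
```python
def percent(count, total):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


TOTAL_PAPERS = 266        # MoDELS papers surveyed, 2006 through 2010
NO_EVALUATION = 195       # papers with no empirical evaluation

no_eval_pct = percent(NO_EVALUATION, TOTAL_PAPERS)
own_tool_only_pct = 10    # reported in the text; the raw count is not given
remaining_pct = 100 - no_eval_pct - own_tool_only_pct
```
        </preformat>
        <p>Under these inputs, the share without evaluation rounds to 73% and the remainder is 17%, matching the figures reported above.</p>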
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Results of ESEM Survey</title>
        <p>As evidenced by the discussion in the previous section, the MoDELS conference
appears to be focused mainly on proposing new tools and techniques without rigorous
evaluation. To the authors’ credit, the paper length restrictions of the LNCS format
used in MoDELS leave little space for discussion of formal evaluation. The ESEM
conference is a general software engineering conference that focuses on the empirical
evaluation of newly proposed techniques across all software engineering topics. We
analyzed the same five years of the ESEM conference to determine whether more
formal evaluations of modeling research were being published there. To identify the
set of papers, we queried the proceedings using the following keywords: “UML,”
“DSL,” “metamodel,” “model,” and “model-driven.” The modeling experts then
vetted the results of the search to ensure that the papers were within the scope of
software modeling.</p>
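        <p>A keyword query of this kind might look like the sketch below. The text does not say whether titles, abstracts, or full text were searched, so matching on titles is an assumption, as are the function names and the sample titles. The keyword list itself is taken from the paragraph above.</p>
        <preformat>
```python
KEYWORDS = ("uml", "dsl", "metamodel", "model", "model-driven")


def matched_keywords(text):
    """Return the survey keywords that occur in text, ignoring case."""
    lowered = text.lower()
    return [kw for kw in KEYWORDS if kw in lowered]


def candidates(titles):
    """Keep titles that match at least one keyword."""
    return [t for t in titles if matched_keywords(t)]


# Hypothetical titles for illustration only; not taken from ESEM.
titles = [
    "An Experiment on UML Diagram Comprehension",
    "Cost Estimation Models for Open Source Projects",
    "Pair Programming in Practice",
]
hits = candidates(titles)
```
        </preformat>
        <p>With substring matching, the broad keyword “model” also retrieves papers about unrelated kinds of models (the second hypothetical title is one such case), which is why a manual vetting step is needed after the search.</p>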
        <p>Based on this analysis, we can make a few interesting observations. The ESEM
conference has three types of papers: Regular Papers, Short Papers, and Posters. In
total, we only found 17 modeling papers across the five years that we analyzed
ESEM. Of those 17 papers, only 4 were Regular Papers (10 pages IEEE or ACM
format) out of a total of 178 Regular Papers and 10 were Short Papers (4 pages) out of
a total of 118 Short Papers. Thus, even when software modeling papers are published
in an empirical venue, they tend to be shorter and do not provide a high level of
detail. In analyzing the five years of ESEM, we were not able to identify any trends
that would suggest the prominence of modeling papers is increasing in the empirical
software engineering community. One final observation: in comparing the author lists
and titles of the ESEM papers against the empirical MoDELS papers, we found very
little overlap; only one paper seemed to be about the same tool or technique. Thus, the
cross-pollination of results across the two communities seems to be very low.
</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Observations from Our Survey</title>
      <p>This section provides a summary of our observations about the papers that focused on
controlled experiments and formative case studies.</p>
      <sec id="sec-4-1">
        <title>4.1. Controlled Experiments</title>
        <p>
          Across the five years of the MoDELS conference, we found twelve controlled
experiments [1], [2], [5], [6], [7], [8], [
          <xref ref-type="bibr" rid="ref11">10</xref>
          ], [
          <xref ref-type="bibr" rid="ref12">11</xref>
          ], [
          <xref ref-type="bibr" rid="ref13">12</xref>
          ], [
          <xref ref-type="bibr" rid="ref14">13</xref>
          ], [
          <xref ref-type="bibr" rid="ref16">15</xref>
          ], [
          <xref ref-type="bibr" rid="ref17">16</xref>
          ]. This category
of papers serves as an example of the types of papers that we feel should be more
prevalent within the MoDELS community. In this section, we provide a brief
discussion of some of the trends observed in these controlled experiment papers.
Overall, the level of detail reported by the authors of these papers is quite low. We
realize that this level of reporting is likely affected by paper length restrictions and the
need to fully describe the newly proposed tool or technique as the core contribution of
the paper. Although we do not have the space to evaluate the quality of each study in
detail, there are two important factors that are relatively easy to evaluate: 1) the
number of participants, and 2) whether the participants were students or professionals.
        </p>
        <p>In terms of the number of participants in the studies, one half of the identified
papers had fewer than 25 participants, and only two studies had more than 50
participants. Furthermore, one study did not even report the number of participants. In
terms of the type of participant, only one study had professionals as a portion of the
participants. The overwhelming majority of the studies relied on undergraduates with
only a few using graduate students. Over 33% of the studies did not specify whether
the participants were students or professionals. The use of student participants is not
necessarily bad, but researchers need to make a clear case as to why student
participants are a valid population for the question under investigation [4].</p>
        <p>There does not appear to be a significant trend in the number of controlled
experiments reported. From 2006 through 2008, the number was increasing. Then,
there was a large drop of such experiments in 2009. The percentage of controlled
experiments in 2010 was equivalent to the percentage reported in 2008. Even in the
best years, only 7% of the papers reported controlled experiments. In general, we
would like to see an increase in both the frequency and diversity of controlled
experiments within the modeling community.
</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Formative Case Studies</title>
        <p>Across the five years of the MoDELS conference, we found ten Formative Case
Study papers. There were two types of Formative Case Studies. First, there were four
studies that did not involve humans. These studies tended to analyze some existing
source code to understand how various modeling tools would or would not work
effectively. Second, there were six studies that focused on humans. These studies
mostly used a survey method to understand how existing tools were not meeting the
needs of developers. The output of many of these studies was a set of requirements
for new tools that were needed. Contrary to the Controlled Experiments, which
focused heavily on student participants, the Formative Case Studies were focused
more on industrial settings. Similar to the Controlled Experiments, we would also like
to see additional Formative Case Studies that provide input to tool and method
developers to help ensure that their work is relevant to the needs of practitioners.
</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>This paper provides evidence that the rigor of empirically validated research in
software modeling is rather weak and should be a focus of future authors of MoDELS
papers. The high incidence of papers with no evaluation is somewhat
alarming when compared to other software engineering venues (e.g., ICSE) where
empirical evaluation is more expected as a scientific contribution. Overall, the level of
empirical evaluation as seen in the software modeling community is quite low for a
scientific and engineering discipline. A goal of this paper is to raise the awareness of
this issue to assist in progressing the area of software modeling with a more scientific
underpinning. Our own future work will include a similar analysis of papers in the
software modeling community’s flagship journal – Software and Systems Modeling.</p>
      <p>As part of this work, we posit that there is a need for more controlled experiments
within the modeling community. We realize that there are at least three factors that
hinder the conduct of these types of studies. First, many researchers in the
modeling community may lack the background or training to carry out empirical
studies. This situation is evidenced by the fact that authors frequently mention
“validation experiments” which are nothing more than the application of the findings
or a toy example. Second, many researchers in the modeling community are more
interested in creating new tools and techniques than they are in performing the
rigorous evaluation of those techniques. Third, given the length restrictions of the
formatting style in the MoDELS conference, there is often not adequate space to
discuss both the new tool or technique and its validation, so most researchers seem to
opt for devoting space to the definition of the tool or technique as representing the
core contribution of their paper.</p>
      <p>Our goal in this paper is to stress the importance of building a culture that values
and expects empirical validation of newly proposed tools and methods. To help
facilitate this goal, we propose the following solutions to the problem. First,
researchers in the modeling domain who are interested in conducting appropriate
empirical evaluations themselves need to collaborate more often with researchers who
have expertise in empirical evaluation of software engineering methods (as the
authors of this paper are doing). Such a collaboration allows both types of researchers
to do what they are interested in and what they do best. Second, we suggest that more
rigorous empirical evaluations of modeling research be published in the ESEM
conference, where the focus is on the empirical evaluation, to cross-pollinate the
contributions of the modeling community with those explicitly working in empirical
techniques. In that venue, authors can devote more space to describing the evaluation
and interpreting the results. A somewhat radical suggestion is to afford MoDELS
authors an additional two to three pages of space for any paper that includes a more
rigorous evaluation based on an empirical study.</p>
      <p>A spreadsheet representing the results of our analysis of MoDELS conferences,
and a summary of the papers analyzed for the ESEM conferences, are available at:
http://www.cs.ua.edu/~carver/Data/2011/EESSMOD/</p>
      <p>Acknowledgments. This research was supported in part by NSF CAREER award
CCF-1052616.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <source>(eds.) Model Driven Engineering Languages and Systems</source>
          , LNCS vol.
          <volume>6395</volume>
          , pp.
          <fpage>303</fpage>
          -
          <lpage>317</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Oslo</surname>
          </string-name>
          , Norway (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Almeida da Silva</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mougenot</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bendraou</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          et al.:
          <article-title>Artifact or Process Guidance, an Empirical Study</article-title>
          . In: Petriu,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Rouquette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            and
            <surname>Haugen</surname>
          </string-name>
          , Ø. (eds.)
          <source>Model Driven Engineering Languages and Systems</source>
          , LNCS vol.
          <volume>6395</volume>
          , pp.
          <fpage>318</fpage>
          -
          <lpage>330</lpage>
          . Oslo, Norway (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Boehm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rombach</surname>
            ,
            <given-names>H. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zelkowitz</surname>
            ,
            <given-names>M. V.</given-names>
          </string-name>
          :
          <article-title>Foundations of Empirical Software Engineering: The Legacy of Victor R. Basili</article-title>
          . Springer (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Carver</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaccheri</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morasca</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          et al.:
          <article-title>A Checklist for Integrating Student Empirical Studies with Research and Teaching Goals</article-title>
          .
          <source>Empirical Software Engineering</source>
          ,
          <volume>15</volume>
          (
          <year>2010</year>
          )
          <fpage>35</fpage>
          -
          <lpage>59</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          4735, pp.
          <fpage>76</fpage>
          -
          <lpage>90</lpage>
          . Nashville,
          <string-name>
            <surname>TN</surname>
          </string-name>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Fuhrmann</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , &amp; von Hanxleden, R.:
          <article-title>Taming Graphical Modeling</article-title>
          . In: Petriu,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Rouquette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            and
            <surname>Haugen</surname>
          </string-name>
          , Ø. (eds.)
          <source>Model Driven Engineering Languages and Systems</source>
          , LNCS vol.
          <volume>6394</volume>
          , pp.
          <fpage>196</fpage>
          -
          <lpage>210</lpage>
          . Oslo, Norway (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Genero</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cruz-Lemus</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caivano</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          et al.:
          <article-title>Assessing the Influence of Stereotypes on the Comprehension of UML Sequence Diagrams: A Controlled Experiment</article-title>
          . In: Czarnecki,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Ober</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            ,
            <surname>Bruel</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.</surname>
          </string-name>
          , et al (eds.)
          <source>Model Driven Engineering Languages and Systems</source>
          , LNCS vol.
          <volume>5301</volume>
          , pp.
          <fpage>280</fpage>
          -
          <lpage>294</lpage>
          . Toulouse, France (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          8.
          <string-name>
            <surname>Gravino</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scanniello</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tortora</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>An Empirical Investigation on Dynamic Modeling in Requirements Engineering</article-title>
          . In:
          <string-name>
            <surname>Czarnecki</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ober</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bruel</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al. (eds.)
          <source>Model Driven Engineering Languages and Systems</source>
          , LNCS vol.
          <volume>5301</volume>
          , pp.
          <fpage>615</fpage>
          -
          <lpage>629</lpage>
          . Toulouse, France (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          9.
          <string-name>
            <surname>Juristo</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Moreno</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Lecture notes on empirical software engineering</article-title>
          . World Scientific, Singapore (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          10.
          <string-name>
            <surname>Lange</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>DuBois</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chaudron</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          et al.:
          <article-title>An Experimental Investigation of UML Modeling Conventions</article-title>
          . In:
          <string-name>
            <surname>Nierstrasz</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Whittle</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harel</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al. (eds.)
          <source>Model Driven Engineering Languages and Systems</source>
          , LNCS vol.
          <volume>4199</volume>
          , pp.
          <fpage>27</fpage>
          -
          <lpage>41</lpage>
          . Genova, Italy (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          11.
          <string-name>
            <surname>Lucrédio</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de M. Fortes</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Whittle</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>MOOGLE: A Model Search Engine</article-title>
          . In:
          <string-name>
            <surname>Czarnecki</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ober</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bruel</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al. (eds.)
          <source>Model Driven Engineering Languages and Systems</source>
          , LNCS vol.
          <volume>5301</volume>
          , pp.
          <fpage>296</fpage>
          -
          <lpage>310</lpage>
          . Toulouse, France (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          12.
          <string-name>
            <surname>Mäder</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Cleland-Huang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>A Visual Traceability Modeling Language</article-title>
          . In:
          <string-name>
            <surname>Petriu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rouquette</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Haugen</surname>
            ,
            <given-names>Ø.</given-names>
          </string-name>
          (eds.)
          <source>Model Driven Engineering Languages and Systems</source>
          , LNCS vol.
          <volume>6394</volume>
          , pp.
          <fpage>226</fpage>
          -
          <lpage>240</lpage>
          . Oslo, Norway (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          13.
          <string-name>
            <surname>Prochnow</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>von Hanxleden</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Statechart Development Beyond WYSIWYG</article-title>
          . In:
          <string-name>
            <surname>Engels</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Opdyke</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidt</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al. (eds.)
          <source>Model Driven Engineering Languages and Systems</source>
          , LNCS vol.
          <volume>4735</volume>
          , pp.
          <fpage>635</fpage>
          -
          <lpage>649</lpage>
          . Nashville, TN (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          14.
          <string-name>
            <surname>Shull</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sjøberg</surname>
            ,
            <given-names>D. I. K.</given-names>
          </string-name>
          :
          <article-title>Guide to Advanced Empirical Software Engineering</article-title>
          . Springer (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          15.
          <string-name>
            <surname>Stålhane</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Sindre</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Safety Hazard Identification by Misuse Cases: Experimental Comparison of Text and Diagrams</article-title>
          . In:
          <string-name>
            <surname>Czarnecki</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ober</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bruel</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al. (eds.)
          <source>Model Driven Engineering Languages and Systems</source>
          , LNCS vol.
          <volume>5301</volume>
          , pp.
          <fpage>721</fpage>
          -
          <lpage>735</lpage>
          . Toulouse, France (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          16.
          <string-name>
            <surname>Yue</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Briand</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Labiche</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>A Use Case Modeling Approach to Facilitate the Transition towards Analysis Models: Concepts and Empirical Evaluation</article-title>
          . In:
          <string-name>
            <surname>Schürr</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Selic</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (eds.)
          <source>Model Driven Engineering Languages and Systems</source>
          , LNCS vol.
          <volume>5795</volume>
          , pp.
          <fpage>484</fpage>
          -
          <lpage>498</lpage>
          . Denver, CO (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>