<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Empirical Validation of a Software Requirements Specification Checklist</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Martin de Laat</string-name>
          <email>martindelaat90@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maya Daneva</string-name>
          <email>m.daneva@utwente.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Twente</institution>
          ,
          <addr-line>Drienerlolaan 5, 7522 NB Enschede</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>[Context/Motivation] For areas such as Government IT Procurement, the Software Requirements Specification (SRS) often forms the basis for a public procurement. In these cases, those who have domain knowledge often lack RE expertise, and vice versa. Domain experts lacking the necessary RE experience have difficulty assessing the quality and correctness of the SRS. This is especially problematic where the SRS acts as the basis for a proposal and the resulting contract, which is why third-party RE experts are often consulted to evaluate the SRS beforehand. These experts are highly motivated to improve their process and offer a more uniform and better service. [Question/problem] Is our developed checklist a valid instrument to support the RE practitioner in the SRS validation process? [Principal ideas/results] We propose to empirically evaluate the checklist in a live study. Participants in our evaluation study are asked to simulate the validation of a sample SRS, guided by our checklist. We will analyze the data from a post-use questionnaire at the end of the session using a mixed method approach to assess the quality and usability of our checklist and its expected impact on the validation process. This assessment will be part of the overall validation of our checklist. [Expected Contribution] First, we expect to gain knowledge regarding the quality of our instrument. Second, this live study contributes to the validation, and thus the realization, of a practical tool to be used by RE practitioners worldwide.</p>
      </abstract>
      <kwd-group>
        <kwd>Requirements engineering practice</kwd>
        <kwd>RE</kwd>
        <kwd>Software Requirements Specification</kwd>
        <kwd>validation</kwd>
        <kwd>checklist</kwd>
        <kwd>empirical study</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>Research design and objectives</title>
      <sec id="sec-2-1">
        <title>Research problem</title>
        <p>For areas such as Government IT Procurement, the Software Requirements
Specification (SRS) often forms the basis for a public procurement, e.g. through a
request-for-proposals process. In many situations, those who have domain knowledge
lack RE expertise, and vice versa. This introduces two challenges, namely (1)
the persons with domain knowledge lack the tools to properly create and/or
assess the SRS and (2) the RE experts potentially lack sufficient domain
knowledge. The use of an instrument supporting the SRS validation
process could help in both scenarios. Whenever a call for bids is involved,
the requirements specification will form a basis for the contract. Not only does
a good SRS prevent ambiguity between the procurer and the supplier,
it also acts as a safeguard against legal problems further down the road.
Because of this, validation is an integral part of the overall creation of an SRS.
It is important that representatives of the stakeholders or domain experts
within the procurer's company understand and agree with their respective parts
of the SRS and can reasonably assess its quality. Often, third-party experts are
contracted to validate the SRS before the IT procurement process is initiated.
These experts are highly motivated to improve their process and offer a more
uniform and better service.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Motivation and research goal</title>
        <p>
          Our motivation is to improve the validation process and thereby
contribute to the development of better software. The goal of this live study is to
evaluate our instrument (the checklist) and its effect on the validation process of
a given SRS. Most notably, the live study will help identify strengths,
weaknesses and possible missing elements, and give insight into the applicability
of the checklist by a variety of users [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Positioning of this live study within our research project</title>
        <p>
          The project originated as a collaboration between the first author and a large consulting
company in the Netherlands that acts as a third-party expert in the validation of
SRSs in the Government IT Procurement area. This research project started
within the University of Twente's course on Advanced Requirements
Engineering, taught by the second author. The instrument is now being developed as part
of the Master's Project for Computer Science at the University of Twente.
Following [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], an instrument is first designed and its design is justified, and then
evaluated gradually through sequenced applications in specific contexts from
which lessons can be learned about the instrument's use and its improvement.
The checklist itself is developed following the steps outlined by Stufflebeam [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
The checks defined in our instrument are based on the results of a literature
study and on a combination of elements of industry standards such as [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] (superseding
[
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]) and [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Interview sessions with field experts from the company mentioned
above, in which the instrument was showcased, have already taken place and have
yielded positive results. Our checklist is now at a stage in which it should be
empirically evaluated for its quality and desired effect. This proposal focuses
on the design of the live study at REFSQ 2018, where the checklist will be
applied in a test setting for the first time. Future plans consist of testing the use of the instrument
in near-to-real-life settings.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>Research objectives and research questions</title>
        <p>The main goal of this live study is to get a hands-on impression of the quality
of the instrument by having it applied by a variety of participants to a given
SRS. A side benefit of the live study is that it announces our research to the
scientific community and helps it gain traction there. Specifically, we want to
know what the strong and weak points of the checklist are and whether there are any
missing elements (checks). To this end, we ask the following research questions:
RQ1 What is the quality of the checklist based on the selected criteria?
RQ2 What is the effect of the instrument on the validation process?</p>
      </sec>
      <sec id="sec-2-5">
        <title>Research Methodology</title>
        <p>
          For this specific live study, we consulted the empirical evaluation guidelines
provided by Wieringa [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], which are general to the evaluation of any software
engineering artifact or technology in context. In our evaluation study, participants
will be provided with the materials described in Section 2.1 and asked to validate a sample
SRS with the help of our proposed instrument. Participants are invited to share
their perceptions and experiences by completing a questionnaire. The quality of
the instrument will be assessed based on the criteria listed in Section 2.2, rated
by the participants in that questionnaire. These data are analyzed with an extended
version of the mixed method approach by Martz [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. The checklist consists of 112
checks. To cover all checks within the timespan of the session, they are split into a
'core' segment and four 'non-core' segments. The core segment consists of 30 checks and
the remaining 82 checks are distributed as evenly as possible over four groups. Each participant
will cover the 'core' segment and one of the 'non-core' segments. During the live
session the four variations of the materials will be distributed evenly.
        </p>
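        <p>As a minimal sketch of the segment split described above, the following Python fragment partitions 112 checks into a 30-check core and four non-core groups and hands out the four material variants in round-robin order. The sequential numbering of the checks, the choice of the first 30 checks as the core, the round-robin scheme and all function names are assumptions made for illustration only; the actual allocation of checks to segments is not specified here. Note that the 82 non-core checks cannot be divided into four equal groups, so the sketch produces groups of 21, 21, 20 and 20 checks.</p>
        <preformat>
```python
from collections import Counter

TOTAL_CHECKS = 112   # size of the checklist, as stated in the text
CORE_SIZE = 30       # size of the core segment, as stated in the text
NON_CORE_GROUPS = 4  # number of non-core segments, as stated in the text


def split_checks(total=TOTAL_CHECKS, core_size=CORE_SIZE, groups=NON_CORE_GROUPS):
    """Return (core, non_core_groups) as lists of check numbers.

    Assumption: checks are numbered 1..total and the first core_size
    checks form the core segment.
    """
    checks = list(range(1, total + 1))
    core = checks[:core_size]
    rest = checks[core_size:]
    # Deal the remaining checks out one by one so group sizes differ by at most 1.
    non_core = [rest[i::groups] for i in range(groups)]
    return core, non_core


def assign_variants(n_participants, groups=NON_CORE_GROUPS):
    """Hand out material variants 0..groups-1 round-robin (hypothetical scheme)."""
    return [p % groups for p in range(n_participants)]


core, non_core = split_checks()
print(len(core), [len(g) for g in non_core])

# Number of participants per variant when 20 people take part.
per_variant = Counter(assign_variants(20))
print(sorted(per_variant.items()))
```
        </preformat>
        <p>Under these assumptions, 20 participants yield five participants per material variant, which corresponds to the minimum group size the study design aims for.</p>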
      </sec>
      <sec id="sec-2-6">
        <title>About the authors</title>
        <p>The first author, de Laat, is a Master's student in the Computer Science program
of the University of Twente. He earned a bachelor's degree in Business &amp; IT from
the University of Twente. He has broad professional experience in developing
business solutions for Microsoft SharePoint and is currently a part-time Product
Designer and Ruby on Rails developer at Nedap Healthcare.</p>
        <p>The second author, Daneva, has been using empirical evaluation techniques
in her research since 2001. In 2012, she co-authored an online study that was
featured in the REFSQ program and concerned the evaluation of a checklist for
reporting empirical research in RE.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Live study design and logistics</title>
      <p>The study participants (50-60 expected) will be drawn from the pool of attendees
at REFSQ 2018. We require no particular participant profile. For the research
to produce informative results, we need at least 20 participants so that we have
at least 5 participants working on each non-core segment of the checklist. All
participants will jointly attend the introduction to our live study. After this, the
participants will be divided over the available rooms to make sure everybody
has a seat and sufficient space. Participants will receive one of four sets
of materials (see Section 2.1) and will be asked to apply their respective set of checks
during a 60-minute session. Afterwards, during a 10-minute period, participants
are asked to fill in a questionnaire in order to provide perceptions, opinions and
their personal observations from using the checklist. The final 10 minutes are
reserved for the conclusion, thanking the participants and collecting the materials.</p>
      <sec id="sec-3-1">
        <title>Equipment and materials involved in the study</title>
        <p>We need the help of the local organization to book sufficient rooms where the
participants can work in isolation from each other. Participants will be provided
with (1) a sample (short) SRS, (2) their selected set of 'core' and 'non-core'
checks including relevant information from the reference manual, (3) an
answer sheet with check-boxes to mark the results of applying each check and (4)
the post-use questionnaire. Those who bring a mobile device will be requested
to fill in the questionnaire online using SurveyMonkey to speed up the analysis. The
checks assigned to a participant are described in full; the remaining checks are
listed by name only, so that the participant can still comment
on the completeness of the checklist. Everybody will be given the option to leave their contact
details if they wish to be kept informed about the research.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Post-Use Questionnaire</title>
        <p>During the live study, all participants will be asked to fill in a questionnaire
consisting of
1. A set of background questions assessing: (a) the participant's experience with
SRSs, (b) which version of the non-core sets the participant received and
(c) an indication of how much time they spent on the core vs. the non-core
part.
2. A critical feedback survey consisting of: (a) a set of closed questions and (b)
a set of open questions.</p>
        <p>
          In the closed questions section, the participants are requested to rate aspects
of the checklist on a 9-point Likert scale where 1 means "strongly disagree"
and 9 means "strongly agree". The first set of closed questions is derived from
[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] and test the instrument on its: applicability to the full range of intended use,
clarity, comprehensiveness, concreteness, ease of use, fairness, parsimony and
item pertinence to the content area.
        </p>
        <p>
          The second set of questions, put together by the authors, expands on this
list and asks the participants whether they find that the instrument helps prevent
task saturation [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], fits the workflow [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], can be completed in a reasonable
period of time [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], contains sufficient break points, will contribute to the process
of validating an SRS and will improve the quality of the validation output in a
real-life scenario.
        </p>
        <p>
          The open section consists of four questions [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] in which the
participant is asked to identify possible strengths, identify possible weaknesses, identify
items missing from the checklist and give recommendations for improving the
checklist.
        </p>
        <p>Finally, the participant is given the option to write down any final thoughts
regarding the instrument or the session. On a separate document, the participants
will be able to leave their contact details in case they wish to receive further
notifications regarding the development of the instrument.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Data collection and analysis</title>
        <p>
          Data resulting from both the paper and electronic questionnaires will be
combined and analyzed using off-line tooling on a secure computer. The recording
units (a single word or phrase describing a strength or weakness) resulting from
the open questions are grouped into categories identical to those of the closed
questions. The categories of the first set of closed questions will contribute to
answering RQ1 and those of the second set to answering RQ2. Diverging stacked bar
charts and a grouped bar chart will help visualize the quantitative responses,
as described by Robbins and Heiberger [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. In an effort to compare
the responses to the open questions with those to the closed questions, we will
plot the 'Mean Scores' against the 'Net Strength' as proposed by Martz [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. The
final facet of validity addressed in the investigation will be consequential
validity, that is, providing evidence and rationale for evaluating the intended and
unintended consequences of interpretation and use [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
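        <p>The comparison of closed and open responses can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis procedure itself: the data are invented, and 'Net Strength' is assumed here to be the number of strength recording units minus the number of weakness recording units in a category, which should be checked against Martz [<xref ref-type="bibr" rid="ref4">4</xref>]. The split of ratings around the scale midpoint of 5 is likewise an assumed way to prepare counts for a diverging stacked bar chart.</p>
        <preformat>
```python
from statistics import mean

# Hypothetical example data; these are not results from the study.
closed_ratings = {            # category: ratings on the 9-point scale
    "clarity": [7, 8, 6, 9, 5],
    "ease of use": [4, 5, 3, 6, 5],
}
open_units = [                # (category, kind) per coded recording unit
    ("clarity", "strength"),
    ("clarity", "strength"),
    ("clarity", "weakness"),
    ("ease of use", "weakness"),
    ("ease of use", "weakness"),
]


def net_strength(units, category):
    """Strength mentions minus weakness mentions (assumed definition)."""
    strengths = sum(1 for c, k in units if c == category and k == "strength")
    weaknesses = sum(1 for c, k in units if c == category and k == "weakness")
    return strengths - weaknesses


def likert_split(ratings, midpoint=5):
    """Count ratings below, at and above the scale midpoint."""
    below = sum(1 for r in ratings if midpoint > r)
    at = sum(1 for r in ratings if r == midpoint)
    above = sum(1 for r in ratings if r > midpoint)
    return below, at, above


# One (mean score, net strength) point per category, ready for plotting.
points = {
    cat: (mean(vals), net_strength(open_units, cat))
    for cat, vals in closed_ratings.items()
}
for cat, (score, net) in points.items():
    print(cat, score, net, likert_split(closed_ratings[cat]))
```
        </preformat>
        <p>Each category then contributes one point to the plot, so that categories rated highly in the closed questions but mentioned mostly as weaknesses in the open questions become visible.</p>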
      </sec>
      <sec id="sec-3-4">
        <title>Threats to validity</title>
        <p>Internal threats to validity include the possibility that some participants have
prior experience in using checklists. Also, some participants may spend
more time on one section than on the other (for instance, the core vs. non-core
parts). Both may affect our results. We plan to mitigate these threats
by collecting information on the prior checklist-related experience of the
participants and on how much time they spent on core vs. non-core checks. If we get
fewer than 20 participants, we would only be able to assess the checklist based on
its 'core' contents. An external threat to validity is that the provided sample
SRS might not be representative of those used in a real scenario, affecting for
instance the applicability of the instrument. We attempt to mitigate this by
selecting a reasonably standard sample SRS. Outside the scope of this live study,
we plan to mitigate this in future test sessions by having multiple experts test a
diverse set of requirements specifications.</p>
      </sec>
      <sec id="sec-3-5">
        <title>Promotion, incentives and the sharing of the data</title>
        <p>We will work with the local organization so that our invitation to the study
is received by each REFSQ attendee upon registration. During the conference
we aim to share our findings, but we might not have sufficient time to analyze all
results, in which case we would have to limit our analysis to data derived from the
online responses. Apart from contributing to our research, participants will have
the option to voice their opinions and/or concerns regarding the development of
the checklist. Participants who choose to leave their contact information will be
kept informed about the progress of the instrument and will be sent a copy of
the resulting paper(s). After the conference, the results of the live study will be
part of a Technical Report issued by the University of Twente. We plan
to write a journal paper in which we will motivate our checklist proposal and
present its empirical evaluation. The results of this live study will also be included in
the journal submission.</p>
      </sec>
      <sec id="sec-3-6">
        <title>Ethics, confidentiality and consent</title>
        <p>The questionnaire will be filled in anonymously and participants
will be asked to consent to the analysis, processing and publication of
the data they provide during the live study. Apart from assessing the
participants' experience with requirements specifications, no personally identifiable
information will be asked of the participants during the test session. Optional
contact details will be collected on a separate document to
guarantee that the results of the questionnaire cannot be traced back to the participant.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Gawande</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The Checklist Manifesto</article-title>
          .
          <source>Penguin Books India</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. IEEE: IEEE 830:
          <article-title>Recommended Practice for Software Requirements Specifications</article-title>
          .
          <source>Tech. rep. (</source>
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. ISO/IEC/IEEE: ISO/IEC/IEEE 29148:2011:
          <article-title>Systems and software engineering - Requirements engineering</article-title>
          .
          <source>Tech. rep. (</source>
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Martz</surname>
          </string-name>
          , W.:
          <article-title>Validating an evaluation checklist using a mixed method design</article-title>
          .
          <source>Evaluation and Program Planning</source>
          <volume>33</volume>
          (
          <issue>3</issue>
          ),
          <volume>215</volume>
          –
          <fpage>222</fpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Messick</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          : Validity of Psychological Assessment.
          <source>American Psychologist</source>
          <volume>50</volume>
          (
          <issue>9</issue>
          ),
          <volume>741</volume>
          –
          <fpage>749</fpage>
          (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Murphy</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          :
          <article-title>Business is Combat: A Fighter Pilot's Guide to Winning in Modern Warfare</article-title>
          . Harper Collins (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Robbins</surname>
            ,
            <given-names>N.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heiberger</surname>
            ,
            <given-names>R.M.</given-names>
          </string-name>
          :
          <article-title>Plotting Likert and Other Rating Scales</article-title>
          . Joint Statistical Meetings pp.
          <volume>1058</volume>
          –
          <issue>1066</issue>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Robertson</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robertson</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          : Volere.
          <source>Requirements Specification Templates</source>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Stufflebeam, D.L.:
          <article-title>Guidelines for developing evaluation checklists: the checklists development checklist (CDC). Kalamazoo, MI: The Evaluation Center</article-title>
          .
          <source>Retrieved on January 16</source>
          ,
          <year>2008</year>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Wieringa</surname>
          </string-name>
          , R.:
          <source>Design Science Methodology for Information Systems and Software Engineering</source>
          (
          <year>2014</year>
          ), http://portal.acm.org/citation.cfm?doid=1810295.1810446
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Wieringa</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daneva</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Six strategies for generalizing software engineering theories</article-title>
          .
          <source>In: Science of Computer Programming</source>
          . vol.
          <volume>101</volume>
          , pp.
          <volume>136</volume>
          –
          <issue>152</issue>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>