<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Barcelona, Catalunya, Spain, April</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>at the Feet of Giants: Recovering unavailable Requirements Quality Artifacts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Julian Frattini</string-name>
          <email>julian.frattini@bth.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lloyd Montgomery</string-name>
          <email>lloyd.montgomery@uni-hamburg.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davide Fucci</string-name>
          <email>davide.fucci@bth.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jannik Fischbach</string-name>
          <email>jannik.fischbach@netlight.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Unterkalmsteiner</string-name>
          <email>michael.unterkalmsteiner@bth.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel Mendez</string-name>
          <email>daniel.mendez@bth.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>P. Spoletini, D. Amyot. Joint Proceedings of REFSQ-2023 Workshops, Doctoral Symposium, Posters &amp; Tools Track, and</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Blekinge Institute of Technology</institution>
          ,
          <addr-line>Valhallavägen 1, 371 41 Karlskrona</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Gulden, A. Wohlgemuth, A. Hess</institution>
          ,
          <addr-line>S. Fricker, R. Guizzardi, J. Horkof, A. Perini, A. Susi, O. Karras, A. Moreira, F. Dalpiaz</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>In: A. Ferrari</institution>
          ,
          <addr-line>B. Penzenstadler, I. Hadar, S. Oyedeji, S. Abualhaija, A. Vogelsang, G. Deshpande, A. Rachmann, J</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Netlight Consulting GmbH</institution>
          ,
          <addr-line>Sternstraße 5, 80538 Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Hamburg</institution>
          ,
          <addr-line>20146 Hamburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>fortiss GmbH</institution>
          ,
          <addr-line>Guerickestraße 25, 80805 Munich</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>17</volume>
      <issue>2023</issue>
      <abstract>
        <p>Requirements quality literature abounds with publications presenting artifacts, such as data sets and tools. However, recent systematic studies show that more than 80% of these artifacts have become unavailable or were never made public, limiting reproducibility and reusability. In this work, we report on an attempt to recover those artifacts. To that end, we requested corresponding authors of unavailable artifacts to recover and disclose them according to open science principles. Our results, based on 19 answers from 35 authors (54% response rate), include an assessment of the availability of requirements quality artifacts and a breakdown of authors' reasons for their continued unavailability. Overall, we improved the availability of seven data sets and seven implementations.</p>
      </abstract>
      <kwd-group>
        <kwd>Quality</kwd>
        <kwd>requirements quality</kwd>
        <kwd>open science</kwd>
        <kwd>availability</kwd>
        <kwd>artifacts</kwd>
        <kwd>data set</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Data sets and tools are often reported as important contributions to requirements quality
literature [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. However, a recent secondary study of 57 primary studies revealed that as
few as 12% of data sets and 19% of tools are currently publicly available [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The unavailability
of those artifacts has two major consequences. Firstly, empirical results are difficult to
reproduce, which inhibits the process of strengthening the empirical evidence of scientific
contributions. Secondly, the presented artifacts are difficult to reuse, which forces scientific
progress to restart over and over again instead of evolving from existing contributions.
Ultimately, these consequences inhibit the progress of the requirements quality research domain.
      </p>
      <p>https://lloydm.io/ (L. Montgomery); https://dfucci.github.io/ (D. Fucci); https://www.lmsteiner.com/</p>
      <p>
        Following open science practices in software engineering improves the accessibility of
artifacts and the preservation of future contributions [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], but it is difficult to apply these practices in retrospect.
Because of this, the accessibility of artifacts in past publications deteriorated over time [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The
resulting unavailability of artifacts [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] poses a significant challenge to artifact-dependent
research, like the requirements quality or the larger natural language processing for requirements
engineering (NLP4RE) domain. In this work, we set out to recover unavailable requirements quality
artifacts by requesting authors to disclose them according to open science principles. Our main
contribution is the recovery of seven data sets and seven implementations. Nevertheless, 16 out
of 35 (46%) requests to authors remained unanswered, indicating that the research community
needs to emphasize further the importance of persistently archiving research data.
      </p>
      <p>The remainder of this manuscript is structured as follows: Section 2 introduces the background
both on the topic of open science and requirements quality research. Section 3 describes the
process and Section 4 the results of the recovery. We discuss these results in Section 5 before
concluding in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <sec id="sec-2-1">
        <title>2.1. Open Science in Software Engineering</title>
        <p>
          Scientific work needs to be reproducible [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] to strengthen the evidence it contributes to a
field of research [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Open science is the initiative of ensuring public availability of research
artifacts [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and, hence, facilitating reproducibility. Within open science, the facets of open
access for publications, open data for data sets, and open source for source code are most relevant
to software engineering [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], where each facet of open science entails different techniques
and best practices to disclose its respective type of research artifact. Several governmental
research funding agencies, including the European Union, have made open access to scientific
results (including data, tools, etc.) mandatory<sup>1</sup>.
        </p>
        <p>In the literature, Minocher et al. propose four attributes: data recoverability, data usability,
analytical clarity, and agreement of results. They explicitly emphasize the sequential dependency of
these attributes; for example, analytical clarity of data is meaningless if the data is not recoverable.</p>
        <p>
          Recent endeavors of incentivizing scholars to follow open science principles include open
science badges [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] and the registered reports [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. However, the software engineering research
community is still in the process of adopting open science principles [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], and the unavailability
of artifacts is common [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Prominent reasons for the unavailability of artifacts include the
sensitivity of data or corresponding authors changing their affiliation and consequently losing
access to their artifacts [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. While some reasons for the unavailability of artifacts (e.g., the
sensitivity of company-owned data) may well require significant effort to cope with, other
reasons (e.g., loss of artifact, lack of diligence) can be circumvented easily by following proposed
guidelines [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and making use of modern tools for artifact sharing.
        </p>
        <p><sup>1</sup> https://research-and-innovation.ec.europa.eu/strategy/strategy-2020-2024/our-digital-future/open-science/open-access_en</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Requirements Quality Literature</title>
        <p>
          Recent advances have established that artifacts produced in requirements engineering (RE)
have a significant impact on downstream software development activities [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], potentially even
causing project failure [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Consequently, requirements artifacts merit quality assurance [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
The requirements quality literature is dedicated to providing the understanding as well as the
support for measuring and improving the quality of requirements [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. One popular approach
to this is the proposal of quality factors. Requirements quality publications often formulate
one or more quality factors—e.g., the use of coordination ambiguity leading to divergent
interpretations [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]—annotate instances of that quality factor in a data set, and finally present an
implementation (i.e., an algorithm or full-fledged tool) to detect these instances automatically.
        </p>
        <p>
          These artifacts—both data sets and implementations—represent essential contributions
facilitating empirical research and technology transfer. While the (annotated) data sets are the
main driver for developing new and improving existing implementations for quality factor
detection, implementations are the tools to be deployed in industry for actual integration and
improvement of the software engineering process. The NLP4RE research domain, which applies
natural language processing (NLP) techniques to RE [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] and constitutes a large part of the
contributions to the requirements quality literature [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], is particularly focused on said delivery
and improvement of tools. In addition to the dependency of these NLP-powered tools on
the availability and reliability of training data, this puts the NLP4RE research domain on the
forefront of the open science challenge [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. The NLP4RE community is therefore particularly
aware of its dependency on the availability of artifacts [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
        <p>
          However, recent systematic studies revealed that a significant number of these artifacts are
no longer available<sup>2</sup> or never have been [
          <xref ref-type="bibr" rid="ref1 ref12 ref2">12, 1, 2</xref>
          ]. Table 1 reports the availability status
of 57 data sets (D) and 36 implementations (I) extracted from the 57 primary studies of our
previously published literature review on requirements quality factors [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Recovery Process</title>
      <p>
        The insight that the availability of requirements quality artifacts is insufficient [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] motivated
our objective to improve the state of open science in the requirements quality literature by
ensuring the recoverability of data, a necessary prerequisite for the reproducibility of scientific
work [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>In this section, we document the artifact recovery process step by step. In
Section 3.1, we describe the selection of the sample of primary studies. We detail our approach
to contacting corresponding authors in Section 3.2 and to maintaining correspondence with them in
Section 3.3. Finally, we document the evaluation of the recovery process and its success in Section 3.4.
All produced data, scripts, and documentation are disclosed in our replication package<sup>4</sup>.</p>
      <p><sup>2</sup> Here, available means a status of Upon Request (see Table 1) or better.</p>
      <p>
        <sup>3</sup> In this context, we use the term open source as commonly understood, not as used in the open science
framework [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which would imply adherence to the properties listed under open data.
      </p>
      <p><sup>4</sup> Available at https://doi.org/10.5281/zenodo.7708571</p>
      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Availability status</th><th>Explanation</th></tr>
          </thead>
          <tbody>
            <tr><td>Open Access Link</td><td>The artifact is hosted in a service that satisfies the following criteria: (1) immutable URL (cannot be altered by the author or someone else), (2) permanent (the hosting organization has a mission to maintain artifacts for the foreseeable future), (3) accessible (there is a DOI pointing to the real data source URL), and (4) open-source license (the artifact has a license which grants access and re-use)</td></tr>
            <tr><td>Open Source<sup>3</sup></td><td>[only for implementations] The implementation is available for all to use, and the code base has been disclosed</td></tr>
            <tr><td>Available in Paper</td><td>[only for data sets] The data set is small enough that the authors disclose the entire data set in the manuscript</td></tr>
            <tr><td>Reachable Link</td><td>The artifact is reachable now but is missing some of the Open Data aspects (see above)</td></tr>
            <tr><td>Upon Request</td><td>Authors claim the artifact is available upon request</td></tr>
            <tr><td>Broken Link</td><td>A link to the artifact is contained in the paper, but it does not resolve</td></tr>
            <tr><td>No Link</td><td>An artifact is presented, but no indication of how to access it is provided</td></tr>
            <tr><td>Private</td><td>The authors state that an artifact exists but is private for some reason (such as industry collaboration with private data, etc.)</td></tr>
            <tr><td>Proprietary</td><td>The artifact is proprietary, and access is granted upon payment</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-3-1">
        <title>3.1. Study sample selection and preparation</title>
        <p>
          We used convenience sampling since the primary studies on which we base our results were
selected based on expediency [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. In particular, we recovered artifacts from a set of primary
studies used to build an ontology of requirements quality factors [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. To develop this ontology,
we collected manuscripts reporting quality factors from an original set of publications reported
in another secondary study [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Extracting data sets and implementations from such publications
revealed the unfortunate state of artifact availability.
        </p>
        <p>
          We enhanced the data regarding data sets and implementations from our previous study [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]
with the following information.
        </p>
        <p>
          • Corresponding author: each artifact was associated with a corresponding author.
• Mention: each artifact was associated with its verbatim mention in the manuscript.
Additionally, we corrected errors concerning one data set and three implementations that
had persisted in the previous study [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
        <p>Using a spreadsheet, we collected data about
1. authors (n=35), specifying for each author the name and email address,
2. data sets (n=57), specifying for each data set its containing publication, its verbatim
mention, the corresponding author, and its current availability, and
3. implementations (n=36), specifying for each implementation its containing publication,
its verbatim mention, the corresponding author, and its current availability.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Approaching authors</title>
        <p>We created a Python script that automatically assembles one email for each corresponding
author. This email contained the following elements:
1. Header: an explanation of our endeavor and a request to contribute to open science (or
alternatively explain why this is impossible).
2. Artifact list: a list of artifacts contained in the publications of the authors that were not
open access.
3. Instructions: a brief how-to on properly disclosing artifacts according to the open science
principles, as well as the offer to assist them in the process.
4. Contact: a way to reach out to us.</p>
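        <p>The four elements listed above can be illustrated with a minimal sketch of such an assembly script. This is not the script from the replication package: the wording of the email, the field names, and the contact address are hypothetical placeholders chosen for illustration.</p>
        <preformat>
```python
# Hypothetical sketch: the wording, field names, and contact address are
# illustrative assumptions, not taken from the authors' actual script.
from dataclasses import dataclass


@dataclass
class Artifact:
    kind: str         # "data set" or "implementation"
    publication: str  # title of the containing publication
    mention: str      # verbatim mention of the artifact in the manuscript


def assemble_email(author_name, artifacts, contact="requester@example.org"):
    """Build one recovery request with the four elements listed above."""
    header = (
        f"Dear {author_name},\n\n"
        "we are trying to recover requirements quality artifacts and kindly ask "
        "you to disclose the artifacts listed below according to open science "
        "principles, or to let us know why this is not possible."
    )
    artifact_list = "\n".join(
        f'- {a.kind} in "{a.publication}": "{a.mention}"' for a in artifacts
    )
    instructions = (
        "To disclose an artifact, please archive it in a permanent repository "
        "that issues a DOI and attach a license that permits reuse. "
        "We are happy to assist you in this process."
    )
    contact_line = f"You can reach us at {contact}."
    return "\n\n".join([header, artifact_list, instructions, contact_line])


def assemble_all(artifacts_by_author):
    """Return one email body per corresponding author."""
    return {
        author: assemble_email(author, artifacts)
        for author, artifacts in artifacts_by_author.items()
    }
```
        </preformat>
        <p>Grouping the artifacts by corresponding author before assembling the text keeps the request to one email per author, even when an author is responsible for several artifacts.</p>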
        <p>We approached the authors in a first email on the 30th of November 2022, followed by a
reminder on the 13th of December, and a final reminder on the 11th of January 2023. For authors
who had not responded to our request by the final reminder, we additionally contacted their
co-authors to increase the likelihood of response. We concluded the recovery process on the
8th of February 2023, yielding a time frame of 70 days.</p>
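        <p>The reported time frame follows directly from the dates above. The short calculation below is only a check of that arithmetic; the variable names are our own.</p>
        <preformat>
```python
from datetime import date

# Dates as stated in the text
first_mail = date(2022, 11, 30)
first_reminder = date(2022, 12, 13)
final_reminder = date(2023, 1, 11)
conclusion = date(2023, 2, 8)

# Length of the whole recovery process in days
time_frame_days = (conclusion - first_mail).days

# Days that passed before each reminder was sent
days_to_first_reminder = (first_reminder - first_mail).days
days_between_reminders = (final_reminder - first_reminder).days

print(time_frame_days)  # 70
```
        </preformat>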
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Correspondence</title>
        <p>We kept close contact with the authors we approached by responding within 24 hours
on workdays. During this process, we clarified concerns and offered our help. We processed
and recorded the information contained in the authors’ answers in a spreadsheet file. We
tracked the response status in an additional column, denoting the request as either undeliverable,
unanswered, answered, or completed. We labeled a recovery request as completed once the
corresponding author, for all their artifacts, either improved their availability or explained the
inability to recover or disclose them.</p>
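        <p>The labeling rule described above can be written down as a small function. The following is a sketch under assumptions: the record structure and field names are hypothetical, and treating a request without any recorded artifact outcome as merely answered is our own choice, not something stated in the study.</p>
        <preformat>
```python
from dataclasses import dataclass, field


@dataclass
class ArtifactOutcome:
    improved: bool = False  # the author improved the artifact's availability
    explanation: str = ""   # reason given for not recovering or disclosing it


@dataclass
class RecoveryRequest:
    deliverable: bool = True  # False if no valid contact could be found
    responses: int = 0        # number of emails received from the author
    outcomes: list = field(default_factory=list)  # one entry per artifact


def response_status(request):
    """Derive the response status label of a recovery request."""
    if not request.deliverable:
        return "undeliverable"
    if request.responses == 0:
        return "unanswered"
    # Completed: every artifact was either improved or explained.
    # Assumption: a request without recorded outcomes is not yet completed.
    resolved = bool(request.outcomes) and all(
        outcome.improved or bool(outcome.explanation)
        for outcome in request.outcomes
    )
    return "completed" if resolved else "answered"
```
        </preformat>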
        <p>Furthermore, we documented the dates of the first email sent, the first response received, and
the completion of the request, alongside the number of emails sent by the author, the updated
availability status of the artifacts, and, where applicable, the author’s explanation for
not taking the recommended actions. Two authors coded these explanations independently
and reached full agreement on the types of reasons for non-recovery. When the
corresponding author’s email address was no longer used, we reached out via personal contacts
or social networks like Twitter and LinkedIn.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Evaluation</title>
        <p>To evaluate the artifact recovery process, we generated statistics of the following data from the
documentation in our tables.</p>
        <p>1. Correspondence (author response time and frequency) to evaluate the effort of the
recovery process.
2. Recovery request success (change in artifact availability) to evaluate the success of the
recovery process.
3. Reason for non-recovery (author responses explaining why recovery was not possible) to evaluate the reasons
inhibiting open access.</p>
        <p>We evaluated the data by generating descriptive statistics from our documentation.</p>
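        <p>As an illustration of such descriptive statistics, the sketch below derives response times and email counts from documented dates. The three records are made-up placeholder values and not data from this study; the tuple layout is likewise an assumption.</p>
        <preformat>
```python
from datetime import date
from statistics import mean, median

# Hypothetical correspondence records (placeholder values, not study data):
# (first email sent, first response received, emails sent by the author)
records = [
    (date(2022, 11, 30), date(2022, 12, 1), 2),
    (date(2022, 11, 30), date(2022, 12, 5), 3),
    (date(2022, 11, 30), date(2022, 12, 12), 4),
]

# Response time in days for each answered request
response_days = [(answered - sent).days for sent, answered, _ in records]
emails_per_request = [emails for _, _, emails in records]

summary = {
    "mean_response_days": mean(response_days),
    "median_response_days": median(response_days),
    "mean_emails": mean(emails_per_request),
}
```
        </preformat>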
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <sec id="sec-4-1">
        <title>4.1. Correspondence</title>
        <p>Out of the 35 approached corresponding authors, 19 answered the recovery request, and 13
completed it. We could not reach three authors despite searching for a valid contact. The
distribution of correspondence status is visualized in Figure 1a. It took, on average, 14.6 days for
a corresponding author to reply to our request and 22.4 additional days to complete the request.
On average, a request was resolved in an exchange of 3 emails with the corresponding author.
The distributions of these statistics are visualized in Figure 1b and Figure 1c, respectively.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Artifact Recovery Success</title>
        <p>The corresponding authors improved the availability of seven data sets (four of which follow
open-access principles) and seven implementations (six following open-access principles). This
increases the availability of data sets from 12.3% (7/57, 1 open access) to 22.8% (13/57, 5 open
access) and the availability of implementations from 19.4% (7/36, 0 open access) to 30.6%
(11/36, 6 open access). Authors further confirmed the unavailability of 21 data sets and six
implementations and provided reasons for the inability to recover or disclose them.</p>
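        <p>The percentages above can be retraced from the reported counts. The following sketch only redoes that arithmetic; the function and variable names are our own.</p>
        <preformat>
```python
def availability_rate(available, total):
    """Share of available artifacts in percent, rounded to one decimal."""
    return round(100 * available / total, 1)


# Counts as reported in the text: (available before, available after, total)
reported = {
    "data sets": (7, 13, 57),
    "implementations": (7, 11, 36),
}

# Availability before and after the recovery, per artifact type
rates = {
    kind: (availability_rate(before, total), availability_rate(after, total))
    for kind, (before, after, total) in reported.items()
}
```
        </preformat>
        <p>For the counts above, this yields 12.3% and 22.8% for data sets and 19.4% and 30.6% for implementations, matching the reported values.</p>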
        <p>Figure 2 visualizes the success of the recovery request. The heatmap considers all artifacts
(data sets in Figure 2a and implementations in Figure 2b) where the corresponding author
completed the recovery request. The number in a cell represents the number of artifacts for
which the original availability (on the y-axis) has been updated to the new availability (on the
x-axis). The count of artifacts whose availability remained the same (e.g., because an author
confirmed that the artifact could not be made more available) is reported on the diagonal (shaded
gray). An improvement in the availability of an artifact contributes to cells to the right of the
diagonal, a deterioration of the availability to the left.</p>
        <p>
          For example, one implementation was previously available upon request [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. Now that the
authors disclosed the implementation following open access principles<sup>5</sup>, the entry moved three
cells to the right (see Figure 2b).
        </p>
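        <p>The counting behind such a heatmap can be sketched as follows. The category order is read from the axis labels of the data set heatmap; applying the same order to implementations, as well as all function names, are assumptions of this sketch.</p>
        <preformat>
```python
from collections import Counter

# Availability categories ordered from least to most available, following the
# axis labels of the data set heatmap. Applying the same order to
# implementations is an assumption of this sketch.
SCALE = [
    "Proprietary", "Private", "No Link", "Broken Link",
    "Upon Request", "Reachable Link", "Available in Paper", "Open Access Link",
]
RANK = {status: index for index, status in enumerate(SCALE)}


def shift(original, updated):
    """Cells moved along the scale: positive values are improvements."""
    return RANK[updated] - RANK[original]


def transition_matrix(changes):
    """Count artifacts per (original, updated) pair as a nested list.

    Rows follow the original availability, columns the updated availability,
    so unchanged artifacts end up on the diagonal.
    """
    counts = Counter(changes)
    return [[counts[(row, col)] for col in SCALE] for row in SCALE]
```
        </preformat>
        <p>With this ordering, an artifact that moves from Upon Request to Open Access Link has a shift of three, which corresponds to the three cells mentioned in the example above.</p>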
        <p>Figure 2a: Data sets. Heatmap of original availability (y-axis) against updated availability (x-axis) over the categories Proprietary, Private, No Link, Broken Link, Upon Request, Reachable Link, Available in Paper, and Open Access Link.</p>
        <p>The inability to recover or disclose artifacts was reported as follows: among 21 unrecoverable
data sets, 15 were lost (i.e., the author could not find them anymore or the contact, whom the
author assumed had the data, was unreachable), and six could not be disclosed due to sensitive
contents. Among the six unrecoverable implementations, three became proprietary, and three
were lost.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>
        Within the 70-day time frame for the process, authors of requirements quality publications
recovered several data sets and implementations that are now available for reproduction of
scientific results and reuse in future projects. We referenced the recovered artifacts in the
requirements quality factor ontology [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]<sup>6</sup> as well as in our replication package to make them
accessible.
      </p>
      <p>
        Additionally, the authors confirmed the unavailability of several more artifacts. While
this does not actively improve the availability of artifacts for reproduction, it clarifies the
ambiguous status of several data sets and implementations. Overall, when authors answered a
recovery request, they either recovered their data or reported the inability to do so with helpful
explanations. Recovery requests failed due to (1) no response, (2) the artifact being lost, or
(3) the artifact containing sensitive information. We did not encounter other reasons for the
failure of a recovery request, which corroborates the goodwill of the sampled requirements
quality community in its commitment to open science. This stands in contrast to the experience
of other artifact recovery attempts, where researchers encountered reasons like requests for
reimbursements or not seeing any personal gain in the recovery [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p><sup>5</sup> Now publicly available at https://doi.org/10.5281/zenodo.7484023</p>
      <p><sup>6</sup> See Content at http://reqfactoront.com/</p>
      <p>
        We cannot claim that our observations are universally valid for the software (requirements)
engineering community due to the limitations of our study. For one, the set of primary studies
was obtained via convenience sampling from a previous study [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This sample has known
limitations as several primary studies relevant to requirements quality literature are missing.
Hence, the results of recovery success and correspondence do not represent the complete
requirements quality literature and research community. Furthermore, our conclusion regarding
the status of correspondence, especially the status of unanswered and answered requests, is
limited by how we decided to approach corresponding authors. Using email as the means of
communication impedes the response rate since email addresses are often abandoned with a change of
affiliation [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. The limited success of correspondence is a consequence of the time frame and
communication channels used in this study rather than an indicator of the research community’s
attitude towards open science.
      </p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>Both the credibility and reusability of previous publications in the requirements quality literature
have been impeded by the unavailability of data sets and implementations. We requested
corresponding authors of 57 publications to disclose their artifacts according to the open science
principles. We improved the availability of seven data sets and seven implementations, several
of which now follow open science principles.</p>
      <p>
        With this study, we want to raise awareness about the importance of recovering artifacts
associated with older publications. While adherence to the open science principles recently
rose thanks to comprehensive guidelines (see [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]) or community initiatives such as artifact
evaluation tracks at conferences, they are rarely applied retroactively to previous publications.
Furthermore, we hope that the material we created will support researchers in areas that heavily
rely on artifacts, such as NLP4RE, to recover more of them.
      </p>
      <p>Our agenda, in the scope of the requirements quality factor ontology<sup>7</sup>, includes providing a
central repository of updated information on the availability and location of relevant artifacts.
We invite researchers to contribute to this cause and strengthen the evidence in our field.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The KKS foundation supported this work through the S.E.R.T. Research Profile project at Blekinge
Institute of Technology. We additionally thank the reviewers for their valuable feedback upon
which the manuscript was improved.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Montgomery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bourafa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Scholz</surname>
          </string-name>
          , W. Maalej,
          <article-title>Empirical research on requirements quality: a systematic mapping study</article-title>
          ,
          <source>Requirements Engineering</source>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Frattini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Montgomery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fischbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Unterkalmsteiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mendez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fucci</surname>
          </string-name>
          ,
          <article-title>A live extensible ontology of quality factors for textual requirements</article-title>
          ,
          <source>in: 2022 IEEE 30th International Requirements Engineering Conference (RE)</source>
          , IEEE,
          <year>2022</year>
          , pp.
          <fpage>274</fpage>
          -
          <lpage>280</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Mendez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Graziotin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wagner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Seibold</surname>
          </string-name>
          ,
          <article-title>Open science in software engineering</article-title>
          ,
          <source>in: Contemporary empirical methods in software engineering</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>477</fpage>
          -
          <lpage>501</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Anda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. I.</given-names>
            <surname>Sjøberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mockus</surname>
          </string-name>
          ,
          <article-title>Variability and reproducibility in software engineering: A study of four companies that developed the same system</article-title>
          ,
          <source>TSE</source>
          <volume>35</volume>
          (
          <year>2008</year>
          )
          <fpage>407</fpage>
          -
          <lpage>429</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Tennant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Beamer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bosman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Brembs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. C.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Clement</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Crick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dugan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dunning</surname>
          </string-name>
          , et al.,
          <article-title>Foundations for open scholarship strategy development</article-title>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Kidwell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. B.</given-names>
            <surname>Lazarević</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Baranski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. E.</given-names>
            <surname>Hardwicke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Piechowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-S.</given-names>
            <surname>Falkenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kennett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Slowik</surname>
          </string-name>
          , et al.,
          <article-title>Badges to acknowledge open practices: A simple, low-cost, effective method for increasing transparency</article-title>
          ,
          <source>PLoS Biology</source>
          <volume>14</volume>
          (
          <year>2016</year>
          )
          <elocation-id>e1002456</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Nosek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Ebersole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>DeHaven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. T.</given-names>
            <surname>Mellor</surname>
          </string-name>
          ,
          <article-title>The preregistration revolution</article-title>
          ,
          <source>Proceedings of the National Academy of Sciences</source>
          <volume>115</volume>
          (
          <year>2018</year>
          )
          <fpage>2600</fpage>
          -
          <lpage>2606</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gabelica</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bojčić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Puljak</surname>
          </string-name>
          ,
          <article-title>Many researchers were not compliant with their published data sharing statement: mixed-methods study</article-title>
          ,
          <source>Journal of Clinical Epidemiology</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wagner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Fernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Felderer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vetrò</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kalinowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wieringa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pfahl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Conte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-T.</given-names>
            <surname>Christiansson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Greer</surname>
          </string-name>
          , et al.,
          <article-title>Status quo in requirements engineering: A theory and a global family of surveys</article-title>
          ,
          <source>TOSEM</source>
          <volume>28</volume>
          (
          <year>2019</year>
          )
          <fpage>1</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Mendez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wagner</surname>
          </string-name>
          ,
          <article-title>Naming the pain in requirements engineering: Design of a global family of surveys and first results from Germany</article-title>
          ,
          <source>in: Proceedings of the 17th International Conference on Evaluation and Assessment in Software Engineering</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>183</fpage>
          -
          <lpage>194</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ezzini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Abualhaija</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Arora</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sabetzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Briand</surname>
          </string-name>
          ,
          <article-title>Using domain-specific corpora for improved handling of ambiguity in requirements</article-title>
          ,
          <source>in: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>1485</fpage>
          -
          <lpage>1497</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Alhoshan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferrari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Letsholo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Ajagbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.-V.</given-names>
            <surname>Chioasca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Batista-Navarro</surname>
          </string-name>
          ,
          <article-title>Natural language processing for requirements engineering: A systematic mapping study</article-title>
          ,
          <source>ACM Computing Surveys (CSUR)</source>
          <volume>54</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F.</given-names>
            <surname>Dalpiaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferrari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Franch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Palomares</surname>
          </string-name>
          ,
          <article-title>Natural language processing for requirements engineering: The best is yet to come</article-title>
          ,
          <source>IEEE Software</source>
          <volume>35</volume>
          (
          <year>2018</year>
          )
          <fpage>115</fpage>
          -
          <lpage>119</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Minocher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Atmaca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bavero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>McElreath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Beheim</surname>
          </string-name>
          ,
          <article-title>Reproducibility improves exponentially over 63 years of social learning research</article-title>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Baltes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ralph</surname>
          </string-name>
          ,
          <article-title>Sampling in software engineering research: A critical review and guidelines</article-title>
          ,
          <source>Empirical Software Engineering</source>
          <volume>27</volume>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>F.-L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Horkoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Borgida</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Guizzardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mylopoulos</surname>
          </string-name>
          ,
          <article-title>Engineering requirements with Desiree: An empirical evaluation</article-title>
          ,
          <source>in: International Conference on Advanced Information Systems Engineering</source>
          , Springer,
          <year>2016</year>
          , pp.
          <fpage>221</fpage>
          -
          <lpage>238</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Wren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Grissom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Conway</surname>
          </string-name>
          ,
          <article-title>E-mail decay rates among corresponding authors in medline: The ability to communicate with and request materials from authors is being eroded by the expiration of e-mail addresses</article-title>
          ,
          <source>EMBO Reports</source>
          <volume>7</volume>
          (
          <year>2006</year>
          )
          <fpage>122</fpage>
          -
          <lpage>127</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>