<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Workshops, Doctoral Symposium, and
Poster &amp; Tools Track, Birmingham, UK</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>A Method to Deal with Social Bias and Desirability in Ethical Requirements</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Claudia Negri-Ribalta</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre de Recherche en Informatique, Paris I Panthéon-Sorbonne Université</institution>
          ,
          <addr-line>90 Rue de Tolbiac, 75013, Paris</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>2</volume>
      <fpage>1</fpage>
      <lpage>03</lpage>
      <abstract>
        <p>[Context] Ethical requirements are a growing area of importance for software. Yet, when eliciting these types of requirements from users, the subject might not always give an accurate representation of their requirements, due to social factors. [Question/problem] Which methods can help mitigate social desirability when dealing with ethical requirements? [Principal ideas/results] This article proposes the usage of factorial survey experiments (FSE) to work around social desirability when working with ethical requirements. FSE works with vignettes, which the RE practitioner presents to the subject and experimentally varies to understand how the subject reacts to different stimuli. It enables quantitative analysis of requirements and their specifications, adding explainability and transparency to the RE process. [Contribution] This article describes how to use FSE for ethical requirements, and its advantages. We also give an example of application, for which we share preliminary results. Our work opens the discussion for a possible framework using FSE for ethical requirements.</p>
      </abstract>
      <kwd-group>
        <kwd>Requirements</kwd>
        <kwd>Ethics</kwd>
        <kwd>Methodology</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Discussion on the relationship of technology with ethics isn’t new [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Recently, there’s been
growing consideration of the ethical aspects of information systems (IS) as an area of research,
ranging from privacy concerns to online gambling [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. One question is how to include ethical
requirements from the early stages of software development [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ].
      </p>
      <p>
        However, discussing ethics in software development isn’t simple. As [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] has suggested,
engineers seem to think that ethical issues aren’t part of their job and are mostly the responsibility
of regulation and non-engineers. Also, different stakeholders might have different ethical
paradigms, which can give rise to tensions between values [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and some questions might lead
to answers affected by social-desirability bias.
      </p>
      <p>In this paper, we discuss FSE as a method for transparent and accountable system
design, particularly when dealing with ethical requirements. We propose the usage of FSE, a
well-adopted tool from sociology and other social sciences used to study items such as beliefs,
intentions, perceptions, or elements that can have socially desirable answers. The experimental
nature of FSE allows the RE practitioner to statistically analyze how different specifications
interact with each requirement, without necessarily developing the software. This type of
analysis can help the RE practitioner explain a system design in a transparent and accountable
manner, while also keeping a trace of why such a design was chosen.</p>
      <p>The article is divided as follows: Section 2 reviews previous work, and Section 3
introduces the reader to FSE and explains how FSE can be used for RE. In Section 4,
we share a practical example of FSE on studying trust ($TR), open-source ($OS), and data
protection ($DP) requirements in COVID-19 contact tracing apps. Section 5 presents future
work and the conclusion.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>
        There is a growing literature that highlights the importance of including ethical concerns in
software design from the early stages [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref5">2, 5, 1, 3</xref>
        ]. According to [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], ethical requirements, or ethical
design, can be understood as:
        <list list-type="bullet">
          <list-item><p>the process by which an IS is developed by taking into consideration elements that can
be related to moral norms/values; OR</p></list-item>
          <list-item><p>when an IS has consequences beyond the group that developed it; OR</p></list-item>
          <list-item><p>it is “related to visions about how to live the ethically good life and virtues involved in
that” [<xref ref-type="bibr" rid="ref1">1</xref>]</p></list-item>
        </list>
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] discusses how goal-based RE can be insufficient for ethical requirements and recognizes
that, since not all requirements can be met, trade-offs have to be made, which is itself an ethical
choice. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] presents how Value Sensitive Design (VSD) isn’t a well-integrated concept in the
RE community, compared to HCI, and proposes a methodology based on VSD to elicit ethical
requirements. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] proposes an iterative framework for ethical requirements. The authors of [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
study the relationship between ethical requirements and addictive technology and highlight
that future work should focus on all the aspects of the RE process and ethical requirements.
      </p>
      <p>
        This article differs from [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], as FSE experimentally varies the vignettes and also assumes
that the ethical requirements have already been discovered. FSE could help deal with the biases and
differences in perception of different stakeholder groups, presented in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In comparison to
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], this article doesn’t present a list of categories of ethical requirements, nor does it work with a
specific technology. Finally, [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] reflects on and extensively discusses the relationship between technical and
ethical requirements, raising research questions, but doesn’t propose a clear methodology or
framework for the topic.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Research methodology for ethical requirements</title>
      <p>
        FSE is a research method that is useful for investigating beliefs, intentions, attitudes, and
topics that are subject to social desirability, among others [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ]. It has been used in other areas of
research, such as sociology and psychology. In requirements engineering, it has
been used by [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ].
      </p>
      <p>
        In brief, this research method prompts the subject to evaluate a set of vignettes that describe
a situation around the defined elements of interest. The subject is asked to rate different
experimentally varied vignettes, so the method behaves like an experiment (and thus has internal validity).
Rather than asking the subject directly about a topic that might be prone to social-desirability
bias, the subject rates different vignettes (whose factors might be prone to that bias). This makes it
possible to statistically analyze how the factors behave and whether there is a significant difference
between levels. For example, [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] shows how sex, education, occupation,
citizenship and other variables affect the perception of fair income, through the usage of FSE.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Factorial Survey Experiments</title>
        <p>The first step in FSE is to specify the elements of interest of the research (such as
privacy), which FSE labels as factors. For RE, these could be the requirement(s) of
the system. The factors should have different levels, that is to say, different values or
specifications. For example, a researcher might be interested in seeing how the requirements
of data protection and security relate to each other, thus investigating which specification
best fits the intentions of a system. This could help with the prioritization of requirements and
with identifying which specifications suit the stakeholders better.</p>
        <p>Once the researcher defines the factors and their levels, they must construct the vignettes.
The vignettes describe a situation, either real or hypothetical. There is no single way to
build vignettes, as these can be images, text, or video. Each vignette uses a specific and unique
combination of factor levels. The size of the vignette universe depends on the number of factors
and their levels. For example, if there are 5 factors with 3 levels and 1 factor with 4 levels
(known as a 3<sup>5</sup>4<sup>1</sup> design), the universe is 972 vignettes. However, it is unreasonable to ask
a subject to rate this many vignettes, and thus a sample of vignettes should be selected.</p>
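        <p>As a rough illustration of the arithmetic above, the following Python sketch enumerates a full factorial vignette universe. The factor names and level labels are placeholders invented for this example; only the shape of the design (five three-level factors and one four-level factor) comes from the text.</p>
        <preformat preformat-type="code">
```python
from itertools import product
from math import prod

# Hypothetical design: five factors with 3 levels each and one factor with 4 levels.
# Factor names and level labels are placeholders for illustration only.
factors = {f"factor_{i}": ["low", "medium", "high"] for i in range(1, 6)}
factors["factor_6"] = ["a", "b", "c", "d"]

# The size of the universe is the product of the number of levels per factor.
universe_size = prod(len(levels) for levels in factors.values())

# Each vignette is one unique combination of levels, one level per factor.
universe = [dict(zip(factors, combo)) for combo in product(*factors.values())]

print(universe_size)  # 3**5 * 4
print(len(universe))
```
        </preformat>
        <p>Enumerating the universe explicitly, rather than only computing its size, gives a candidate set from which a sample of vignettes could later be drawn.</p>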
        <p>
          There are different ways of selecting a sample of vignettes, such as random sampling, efficient designs,
blocking, and confounding variables [
          <xref ref-type="bibr" rid="ref11 ref6 ref7">11, 7, 6</xref>
          ]. [
          <xref ref-type="bibr" rid="ref11 ref7">7, 11</xref>
          ] indicate that random sampling of vignettes has
several disadvantages from a statistical point of view, in particular a loss of explanatory power.
Thus the literature suggests the usage of an efficient design (such as a D-efficient design) when doing a
fractional factorial survey, even when the perfect orthogonality objective is relaxed [
          <xref ref-type="bibr" rid="ref10 ref11 ref7">7, 11, 10</xref>
          ].
This type of design can be generated with statistical tools such as R or STATA. In R, packages
such as AlgDesign provide functions for selecting the vignette
sample. It is also possible to present one vignette per subject; however, this type of approach
can affect the statistical power of the model, as it is no longer an experimental survey [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>
          The survey can be shared through different means, such as in person or online. The decision
of which strategy to follow depends on the case. The data can be analyzed using
different statistical models, such as mixed multilevel, ANOVA, or Tobit models [
          <xref ref-type="bibr" rid="ref10 ref7">7, 10</xref>
          ]. It is
important to note that if a subject rates several vignettes, the data isn’t independent and the
subject’s ID adds error to the model [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>The data analysis can happen at different levels: the vignette level (Level 1 - L1) and the
respondent-specific level (Level 2 - L2). L1 makes it possible to identify how the subject reacted to the different stimuli;
these are known as within-subject variables. For example, it is possible to analyze how the
specifications affect a specific requirement. L2 makes it possible to analyze how the different interest
groups or control variables (such as gender or age) rated the vignettes, and how the
groups differ; these are the between-subject variables. Different statistical software offers
options to carry out different types of statistical analysis and models for this type of data. For
example, R provides the "ordinal" package, which offers the clmm (Cumulative
Link Mixed Models) function for fitting a mixed multilevel model that can be used to analyze the data.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. FSE and ethical requirements - How to use them</title>
        <p>FSE seems to be particularly interesting for ethical requirements. As stated in section 3.1, FSE
has a longstanding tradition of being used on topics that can have a socially desirable answer,
attitudes, or judgments.</p>
        <p>
          A similar case can be made for requirements, particularly ethical ones. In RE, FSE has been used
to study the relationship between security requirements and users’ perception of risk
[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. From sociology, [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] reports that although most subjects would have answered that they
believe in gender equality, the data obtained by FSE shows the contrary. Software
development could take a similar stance when dealing with topics or concepts such as gender
issues, privacy, or fairness.
        </p>
        <p>
          By describing a set of realistic scenarios and experimentally varying them, the RE practitioner
can analyze the subjects’ attitude or reactions to the stimuli [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. As a consequence of the
experimental variation of the factors, FSE reflects the subjects’ reaction to different
specifications (the stimuli) without necessarily developing the whole system. Furthermore, given its
experimental and survey characteristics (which, if correctly designed, provide internal and external validity),
the RE process can be replicated. This method may allow the RE practitioner to explain the
prioritization of a requirement, beyond ethical topics. As such, FSE gives accountability to the
RE process, as it helps the RE practitioner explain the rationale of a design.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Use Case - Using FSE for COVID-19 contact tracing apps</title>
      <p>We enquired about users’ willingness to download a COVID-19 contact
tracing app, using the FSE research method. The requirements were chosen through a review of the
literature available from September 2020 to May 2021. The parameters, inspired by the literature
review, were as follows.</p>
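      <list list-type="bullet">
        <list-item><p>App provider: government, private company, university, and any combination of these (7
possible values) ($TR)</p></list-item>
        <list-item><p>Data Protection: high, basic, low ($DP)</p></list-item>
        <list-item><p>Open Source: open source, proprietary code ($OS)</p></list-item>
      </list>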
      <p>Given the size of the vignette universe (42) and taking into consideration fatigue effects on subjects,
the universe was divided into decks by the $OS factor. In other words, subjects would
receive a sample (or treatment) containing either open-source or proprietary code vignettes.
Therefore $TR and $DP are within-subject variables, while $OS is a between-subject variable.</p>
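      <p>The construction of this universe and its decks can be sketched in Python as follows. This is an illustrative reconstruction from the parameters listed above, not the tooling used in the study; the label strings and variable names are assumptions made for the example.</p>
      <preformat preformat-type="code">
```python
from itertools import combinations, product

# Levels as described in the text; the label strings are illustrative spellings.
providers = ["government", "private company", "university"]
# $TR: every non-empty combination of the three providers (7 values).
tr_levels = [c for r in range(1, 4) for c in combinations(providers, r)]
dp_levels = ["high", "basic", "low"]        # $DP
os_levels = ["open source", "proprietary"]  # $OS

# Full factorial universe: one vignette per combination of levels.
universe = [
    {"TR": tr, "DP": dp, "OS": os_}
    for tr, dp, os_ in product(tr_levels, dp_levels, os_levels)
]

# One deck per $OS level: $OS is constant within a deck (between-subject),
# while $TR and $DP vary within it (within-subject).
decks = {
    os_: [v for v in universe if v["OS"] == os_]
    for os_ in os_levels
}

print(len(tr_levels), len(universe), {k: len(d) for k, d in decks.items()})
```
      </preformat>
      <p>Splitting by a single factor keeps every combination of the remaining factors inside each deck, so each subject still sees the full within-subject variation.</p>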
      <p>
        There were 2 decks of 21 vignettes, which can still be considered a large number for one subject to rate. Thus, we
modified the graphical interface and the deck was presented as 4 vignettes, each being divided
into 7 sub-sections, following a "table-like feeling". This was based on feedback received from
the testing phase. The survey used an 11-point Likert scale, following the advice to avoid censored
responses [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>The survey was tested with 80 participants, who reported that they completed the survey
in 4-6 minutes and weren’t fatigued. Other feedback concerned the wording and, as previously stated, the
graphical interface. Comments were integrated into the final survey version.</p>
      <p>The population of interest was French university students. Given that the data gathering
phase occurred while national lockdowns were in place, data was gathered from students living
in the Paris area. The FSE was answered by 434 subjects, and after removing rushers,
subjects that failed the attention test and those not living in Paris, the final dataset has 415
answers from different subjects.</p>
      <p>Our preliminary statistical analysis shows that it is possible to see relationships and differences
between each specification and a requirement. In other words, given the experimental variation
of the vignettes, it is possible to see how each specification affects each requirement, either
negatively or positively. In this use case, the different levels of $TR and $DP appear to have a significant
effect on subjects’ willingness to download. In contrast, the levels of $OS don’t
seem to have a relationship with the willingness to download. The analysis is done in a
reproducible and falsifiable way, allowing other RE practitioners to check whether they arrive at the
same results. These results could help in the development of governance models for COVID-19
contact tracing apps. The results and analysis will be published in future venues, where we will
discuss the impact of the details on the design and governance model of COVID-19 contact
tracing apps.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Future work and conclusion</title>
      <p>For future work, a framework for ethical requirements and design, using FSE, can be proposed.
Such a framework could benefit from the usage of chatbots to automate the process.
Furthermore, a supporting tool could be based on the shiny package and help the RE practitioner define
the requirements, specifications, optimal design options and analysis. Finally, research could be
carried out on the usage of FSE for requirement prioritization, not just for ethical requirements.</p>
      <p>This article presents the usage of FSE as a method for RE practitioners to work around the social
desirability of ethical requirements, in a transparent, explainable and accountable manner. The
RE practitioner must choose the requirements, which describe the system in question. By asking
the user to rate the vignettes in an experimentally varied fashion, the RE practitioner gathers
data on the users’ attitudes towards different specifications. This can help the RE practitioner
choose certain design options over others.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>Thanks to Camille Salinesi, René Noel and Marius Lombard-Platet for their help and comments.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>I.</given-names>
            <surname>Van de Poel</surname>
          </string-name>
          ,
          <article-title>Investigating ethical issues in engineering design</article-title>
          ,
          <source>Science and engineering ethics</source>
          <volume>7</volume>
          (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Detweiler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Harbers</surname>
          </string-name>
          ,
          <article-title>Value stories: Putting human values into requirements engineering</article-title>
          ,
          <source>in: REFSQ Workshops</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Cemiloglu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Arden-Close</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hodge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Kostoulas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Catania</surname>
          </string-name>
          ,
          <article-title>Towards ethical requirements for addictive technology: The case of online gambling</article-title>
          ,
          <source>in: 2020 1st Workshop on Ethics in Requirements Engineering Research and Practice (REthics)</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Grimm</surname>
          </string-name>
          ,
          <article-title>Social desirability bias</article-title>
          ,
          <source>Wiley international encyclopedia of marketing</source>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rashid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Moore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>May-Chahal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chitchyan</surname>
          </string-name>
          ,
          <article-title>Managing emergent ethical concerns for software engineering in society</article-title>
          ,
          <source>in: 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Atzmüller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Steiner</surname>
          </string-name>
          ,
          <article-title>Experimental vignette studies in survey research</article-title>
          ,
          <source>Methodology: European Journal of Research Methods for The Behavioral and Social Sciences</source>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K.</given-names>
            <surname>Auspurg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hinz</surname>
          </string-name>
          ,
          <source>Factorial survey experiments</source>
          , volume
          <volume>175</volume>
          ,
          <publisher-name>Sage Publications</publisher-name>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bhatia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Breaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Reidenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. B.</given-names>
            <surname>Norton</surname>
          </string-name>
          ,
          <article-title>A theory of vagueness and privacy risk perception</article-title>
          ,
          <source>in: 24th International Requirements Engineering Conference (RE)</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>H.</given-names>
            <surname>Hibshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Breaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Broomell</surname>
          </string-name>
          ,
          <article-title>Assessment of risk perception in security requirements composition</article-title>
          ,
          <source>in: 2015 IEEE 23rd International Requirements Engineering Conference (RE)</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Steiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Atzmüller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <article-title>Designing valid and reliable vignette experiments for survey research: A case study on the fair gender income gap</article-title>
          ,
          <source>Journal of Methods and Measurement in the Social Sciences</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Baguley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Dunham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Steer</surname>
          </string-name>
          ,
          <article-title>Statistical modeling of vignette data in psychology</article-title>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>