             Toward Creating a General Ontology for
                      Research Validity

        Roman Lukyanenko1, Kai R. Larsen2, Jeffrey Parsons3, David Gefen4,
                           and Roland M. Mueller5

                    1 HEC Montréal, Montréal, QC, H3T 2A7, Canada
                    2 University of Colorado, Boulder, CO, USA
                    3 Memorial University of Newfoundland, St. John’s, NL, A1B 3X5, Canada
                    4 Drexel University, Philadelphia, PA, USA
                    5 Berlin School of Economics and Law, Berlin, Germany
         roman.lukyanenko@hec.ca, kai.larsen@colorado.edu,
jeffreyp@mun.ca, gefend@drexel.edu, roland.mueller@hwr-berlin.de



       Abstract. Validity is among the most foundational and widely used concepts in
       science. Much has been written on the subject, yet we continue to lack established
       definitions of research validities. This paper presents preliminary results of
       developing a general ontology of research validity. We assembled the largest data
       set of validities to date and used it in conjunction with a general ontology to
       develop an ontology for the 11 core validities used in psychometric behavioral
       studies. We evaluated the ontology with a panel of experts. Our next step is to
       broaden the ontology to the other validities in our data set. A rigorous ontology
       of validity promises to improve our understanding of the nature of validities and
       can be used to develop more precise and consistent definitions for them.

       Keywords: Validity, Research Quality, Categorization, Systems Analysis and
       Design, Conceptual modeling, Ontology, Ontology Engineering


1      Introduction

Much has been written on what it means for social science research to be valid. Broadly,
validity concerns the quality of scientific research and the dependability of scientific
findings. Over the years, researchers have proposed hundreds of specific kinds of validities
(e.g., internal, ecological, discriminant), but we continue to lack agreement on these
concepts. To illustrate, Table 1 presents examples of different ways internal validity has
been conceptualized. Such inconsistencies make it difficult to evaluate research claims,
hinder the integration of findings, and inhibit progress in the social sciences.
   We propose that concepts and techniques from conceptual modeling and ontology
development [1, 2] can be used to develop consistent and precise definitions of the
different forms of validity in social science research and, thereby, organize the large
and messy domain of research validities for the first time.
___________________
Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons
License Attribution 4.0 International (CC BY 4.0).


2      Toward A General Ontology for Research Validity

To develop the ontology, we used existing literature as input. Specifically, we searched
through English-language published books, journal articles, and conference articles
(through Google Books and Google Scholar). The resulting data set, which we refer to as
ValDS, contains over 400 distinct concepts and thousands of definitions for them, and
constitutes, to our knowledge, the largest collection of validities.
   An analysis of this data set reveals the messiness of the domain. As Table 1 shows,
there are numerous instances of different concepts bearing the same name, and of similar
concepts appearing under different names in different studies. We also observed a
predominance of imprecise language in conceptualizations of validities. For example, an
internal validity definition (e.g., [3]) may specify that the test is performed in the
context of an “empirical investigation”, making it unclear whether internal validity
applies exclusively to experiments (as commonly assumed, see, e.g., [4]) or to other types
of studies (e.g., econometric or survey). Finally, there is no standardized way to describe
the phenomena that validities seek to represent (e.g., two validities that each involve
engaging subject matter experts in an activity may refer to them as “judges” or “raters”).

                    Table 1. Some definitions and senses of internal validity.
 Definition
 If empirically observed patterns coincide with the pattern predicted, the case study
 findings have greater internal validity [5]
 When examining the internal validity of the constructs, represented by the loadings to
 their respective construct, one ensures that the items measuring one construct are indeed
 measuring the construct they were designed for [6]
 Internal validity is the degree to which an experiment is able to demonstrate a causal
 relationship between two variables [4]
 Internal validity is the extent to which the conclusions of an empirical investigation are
 true within the limits of the research methods and subjects or participants used [3]


    In developing the ontology, we went through multiple phases (in this paper, we focus
on the first phase). To construct the ontology, we used both top-down and bottom-up
approaches. First, to ensure our interpretations were consistent with, and grounded in,
generally accepted notions and principles of the social sciences, we adopted a seminal
social science ontology by Bunge [7] as the guiding reference for our work.
    The bottom-up approach was based on the analysis of the validities in the ValDS
described above. A key question was how to select the validities with which to start the
process. Having a data set of potentially all validities posed a challenge, as the
popularity of different validities varies (e.g., consider the popular internal validity vs.
the niche cash validity). Although we were inclusive in constructing the ValDS (focusing on
English-language literature), some could question whether certain validities are applicable
to social science research at all. Furthermore, it is debatable whether it is possible to
provide an uncontroversial unified ontology covering both qualitative/interpretive and
positivist/quantitative research [8]. We thus chose to focus on positivist/quantitative
validities first and to develop an ontology representing this area as an initial goal.


   To ensure that we began with the validities widely recognized as essential, we used
the guidelines from the American Psychological Association, the National Council on
Measurement in Education, and the American Educational Research Association, which propose
11 validities (e.g., face, internal, external, ecological) deemed core for typical
psychometric studies in the social sciences. However, we also considered the other
validities among the 400+ to better interpret the 11 core validities.
   We extracted concepts from the definitions of the 11 research validities. For example,
given the third definition [4] in Table 1, we extracted “the degree to”, “experiment”,
“is able to demonstrate”, “causal relationships”, and “two variables”. We then examined
each concept to discover synonyms (e.g., “the degree to” and “the extent to”), being
careful to consider the context and typical uses of the validities before deeming something
synonymous. We also began to model the relationships among concepts (e.g., in the example
above, that two variables are causally related). Next, we identified entities (e.g.,
variable, study, experiment), their attributes (e.g., the location of a study), and the
relationships among entities (e.g., a “variable” is generated within a “study”). For all
key decisions, we consulted Bunge. For example, because an “experiment” is not the only
method for establishing causation between two variables [7], in the final step, when
modeling internal validity, we decided to generalize the entity “experiment” to “inquiry”,
a broader entity that covers the different types of causation-focused studies conducted in
positivist/quantitative research.
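   To illustrate the modeling step, the entities, attributes, and relationships named above
(variable, inquiry, causal relation) can be sketched in code. This is a hypothetical
illustration under our own assumed attribute choices, not the ontology's actual schema:

```python
from dataclasses import dataclass

# Hypothetical sketch of the entities and relationships described above.
# The attribute set is an illustrative assumption, not the authors' schema.

@dataclass
class Inquiry:
    """Generalization of 'experiment': any causation-focused study."""
    name: str
    location: str  # example attribute: the location of a study

@dataclass
class Variable:
    """A variable is generated within a study (inquiry)."""
    name: str
    inquiry: Inquiry

@dataclass
class CausalRelationship:
    """Two variables that are causally related."""
    cause: Variable
    effect: Variable

study = Inquiry(name="field experiment", location="site A")
treatment = Variable(name="treatment", inquiry=study)
outcome = Variable(name="outcome", inquiry=study)
claim = CausalRelationship(cause=treatment, effect=outcome)

# Internal validity concerns the inquiry's ability to demonstrate this
# causal relationship; both variables trace back to the same inquiry.
assert claim.cause.inquiry is claim.effect.inquiry
```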




           Fig. 1. The preliminary Phase 1 (“core 11”) ontology shown using UML.

   Figure 1 shows the resulting preliminary ontology. The ontology expresses the same
concepts present in the original definitions of the validities, only using standardized
language, and it precisely identifies the relationships among the concepts. With the
development of an ontology, we have a single, consistent language for describing phenomena
of interest in the domain of validity. Among other things, this permitted us to rewrite
existing validity definitions. For example, given a definition of predictive criterion
validity, "involves establishing how well the measure predicts future behaviors you'd
expect it to be associated with" [9], we can now express it in the terms of the ontology
(see Figure 1) as: Predictive criterion validity (PCV1) is the extent to which the values
of variable(s) manipulated in an inquiry is(are) similar to the values of criterion
variable(s) obtained in the future. The acronym PCV1 is added to distinguish this sense of
predictive criterion validity from others (i.e., to disambiguate the definitions and avoid
the polysemy shown in Table 1). Indeed, with the use of the ontology, we can take the
definition one step further and express predictive criterion validity as a formula:
      Predictive criterion validity (PCV1) = SIM(variable.manipulated.value, variable.criterion.value);
          subject to: variable.manipulated.context.dateTime < variable.criterion.context.dateTime;
where SIM() is a similarity function that measures the distance between the values of
the manipulated and criterion variables. We foresee the future possibility of interpreting
the output of the SIM() function for a given study (inquiry) by comparing it with quality
norms automatically mined from a given discipline.
   To evaluate Phase 1, we conducted a preliminary study and plan to continue the
evaluations, as per [2]. For the initial evaluation, we were interested in whether experts
in modeling and ontology building would judge the Phase 1 ontology (which, naturally, uses
slightly different language from the original definitions) to be a faithful representation
of the concepts conveyed by the original definitions of the validities and, ultimately, of
the definitions themselves.
   We conducted the evaluation among the members of the 2019 AIS special interest group on
systems analysis and design (SIGSAND), a special interest group of the Association for
Information Systems that includes experts in conceptual modeling and ontology development
who are also generally familiar with validities in the social sciences. The materials
comprised a survey showing randomly selected definitions of validities and a question
asking whether a given validity fit into the ontology.
   Participation was voluntary, and we received 9 completed packages. Among the 18
responses, 15 affirmed the fit. Each of the 3 negative responses contained details about
why the expert believed that a given definition was not well represented in the ontology.
The analysis of the negative responses revealed that the source of the negative response
was not a lack of representation in the ontology, but rather the way in which the
participant interpreted the definition (or the concepts in the ontology). With this
preliminary validation of the ontology, we now plan to conduct a larger and more focused
evaluation.


3        Conclusions and Future Work

The early results of our work suggest that the messy, inconsistent, and ambiguous language
that pervades the domain of research validities can be systematized using top-down (guided
by a general philosophical ontology) as well as bottom-up (mining the set of existing
validities, the largest one to date, which we accumulated for this project) procedures. The
ontology of core validities we developed in Phase 1 appears to adequately represent the
validities deemed essential for typical psychometric studies in the social sciences. A
panel of experts validated the representational fidelity of the resulting ontology.
Importantly, this shows that the procedure we employed can be extended to the 400+
validities to construct a comprehensive unified ontology of validities.


In the next phase, when we develop a more stable version of the ontology, we wish to
revisit the scope and purpose of the ontology [1].
   Once such an ontology is developed, it can be used as a basis to rewrite existing
definitions (in the manner shown in the example above). Ultimately, we hope this re-
sults in a shared reference for researchers and the community. With a common and
standardized language, one can increase certainty that a particular validity test per-
formed in one paper is the same type of test as performed in other research. This can
aid in the synthesis of scientific research and support building cumulative research tra-
ditions within disciplines. Having such an ontology would also provide for the first time
a coherent single structure to organize over 400 distinct validities.
   Likewise, the ontology of validities, as well as the formalized definition of each
validity, might be used in the development of tools that support conducting research in the
social sciences. For example, one can use the ontology to automatically mine existing
publications and identify trends in the use of various kinds of validities. The ontology of
validities can also power future semantic search engines. Finally, the project stands to
extend concepts and methods from the conceptual modeling and ontology engineering community
to social science validities, consistent with the ongoing extension of conceptual modeling
beyond traditional database design [10].


References

1. Fernandes, P.C.B., Guizzardi, R.S., Guizzardi, G.: Using goal modeling to capture compe-
    tency questions in ontology-based systems. Journal of Information and Data Management. 2,
    527 (2011).
2. McDaniel, M., Storey, V.C.: Evaluating Domain Ontologies: Clarification, Classification,
    and Challenges. ACM Computing Surveys. 53, 1–40 (2019).
3. Colman, A.M.: A Dictionary of Psychology. Oxford University Press, Oxford, UK (2015).
4. Dacko, S.: The Advanced Dictionary of Marketing. Oxford University Press, Oxford, UK
    (2008).
5. Gable, G.G.: Integrating case study and survey research methods: an example in information
    systems. European Journal of Information Systems. 3, 112–126 (1994).
6. Goel, L., Johnson, N., Junglas, I., Ives, B.: Predicting users’ return to virtual worlds: a social
    perspective. Information Systems Journal. 23, 35–63 (2013).
7. Bunge, M.: Finding philosophy in social science. Yale University Press, New Haven, CT
    (1996).
8. Denzin, N.K., Lincoln, Y.S.: The Sage handbook of qualitative research. Sage, Thousand
    Oaks, CA (2005).
9. Adler, E.S., Clark, R.: An invitation to social research: How it’s done. Nelson Education,
    New York NY (2014).
10. Jabbari, M.A., Lukyanenko, R., Recker, J., Samuel, B.M., Castellanos, A.: Conceptual Mod-
    eling Research: Revisiting and Updating Wand and Weber’s 2002 Research Agenda. In: AIS
    SIGSAND, pp. 1–12. Syracuse, NY (2018).