=Paper=
{{Paper
|id=Vol-2469/ERDemo06
|storemode=property
|title=Toward Creating a General Ontology for Research Validity
|pdfUrl=https://ceur-ws.org/Vol-2469/ERDemo06.pdf
|volume=Vol-2469
|authors=Roman Lukyanenko,Kai R. Larsen,Jeffrey Parsons,David Gefen,Roland M. Mueller
|dblpUrl=https://dblp.org/rec/conf/er/LukyanenkoLPGM19
}}
==Toward Creating a General Ontology for Research Validity==
Roman Lukyanenko¹, Kai R. Larsen², Jeffrey Parsons³, David Gefen⁴, and Roland M. Mueller⁵

¹ HEC Montréal, Montréal, QC, H3T 2A7, Canada
² University of Colorado, Boulder, CO, USA
³ Memorial University of Newfoundland, St. John's, NL, A1B 3X5, Canada
⁴ Drexel University, Philadelphia, PA, USA
⁵ Berlin School of Economics and Law, Berlin, Germany

roman.lukyanenko@hec.ca, kai.larsen@colorado.edu, jeffreyp@mun.ca, gefend@drexel.edu, roland.mueller@hwr-berlin.de

'''Abstract.''' Validity is among the most foundational and widely used concepts in science. Much has been written on the subject, yet we continue to lack established definitions of research validities. This paper presents preliminary results for developing a general ontology of research validity. In this paper, we assembled the largest data set of validities and used it in conjunction with a general ontology to develop an ontology for the 11 core validities used in psychometric behavioral studies. We evaluated the ontology with a panel of experts. Our next step is to broaden the ontology to the other validities in our dataset. A rigorous ontology of validity promises to improve our understanding of the nature of validities and can be used to develop more precise and consistent definitions for validities.

'''Keywords:''' Validity, Research Quality, Categorization, Systems Analysis and Design, Conceptual Modeling, Ontology, Ontology Engineering

===1 Introduction===

Much has been written on what it means for social science research to be valid. Broadly, validity deals with the quality of scientific research and the dependability of scientific findings. Over the years, researchers have proposed hundreds of specific kinds of validities (e.g., internal, ecological, discriminant), but we continue to lack agreement on these concepts. To illustrate, Table 1 presents examples of different ways internal validity has been conceptualized.
Such inconsistencies can result in difficulty evaluating research claims, hinder the integration of findings, and inhibit progress in the social sciences. We propose that concepts and techniques from conceptual modeling and ontology development [1, 2] can be used to develop consistent and precise definitions of the different forms of validity in social science research and, thereby, for the first time organize the large and messy domain of research validities.

''Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).''

===2 Toward a General Ontology for Research Validity===

To develop the ontology, we used the existing literature as input. Specifically, we searched English-language published books, journal articles, and conference articles (through Google Books and Google Scholar). The resulting data set, which we refer to as ValDS, contains over 400 distinct concepts and thousands of definitions for them, and constitutes, to our knowledge, the largest collection of validities.

An analysis of this dataset reveals the messiness of the domain. As Table 1 shows, there are numerous instances of different concepts bearing the same names, and of similar concepts appearing under different names in different studies. We also observed the predominance of imprecise language in conceptualizations of validities. For example, an internal validity definition (e.g., [3]) may specify that the test is performed in the context of an "empirical investigation", making it unclear whether internal validity applies exclusively to experiments (as commonly assumed, see, e.g., [4]) or to other types of studies (e.g., econometric or survey). Finally, there is no standardized way to describe the phenomena which validities seek to represent (e.g., two validities each looking to engage subject matter experts in an activity may refer to them as "judges" or "raters").

Table 1.
Some definitions and senses of internal validity:

* "If empirically observed patterns coincide with the pattern predicted, the case study findings have greater internal validity" [5]
* "When examining the internal validity of the constructs, represented by the loadings to their respective construct, one ensures that the items measuring one construct are indeed measuring the construct they were designed for" [6]
* "Internal validity is the degree to which an experiment is able to demonstrate a causal relationship between two variables" [4]
* "Internal validity is the extent to which the conclusions of an empirical investigation are true within the limits of the research methods and subjects or participants used" [3]

In developing the ontology, we went through multiple phases (in this paper, we focus on the first phase). To construct the ontology, we used both top-down and bottom-up approaches. First, to ensure our interpretations are consistent and grounded in generally accepted notions and principles of the social sciences, we adopted a seminal social science ontology by Bunge [7] as the guiding reference for our work.

The bottom-up approach was based on an analysis of the validities in the ValDS described above. A key question was how to select the validities with which to start the process. Having a data set of potentially all validities posed a challenge, as the popularity of different validities varies widely (e.g., consider the popular internal validity vs. the niche cash validity). Although we were inclusive in constructing the ValDS (focusing on English-language literature), some could challenge whether certain validities are applicable to social science research. Furthermore, it is debatable whether it is possible to provide an uncontroversial unified ontology for both qualitative/interpretive and positivist/quantitative research [8]. We thus chose to focus on positivist/quantitative validities first and to develop an ontology representing this area as an initial goal.
To ensure that we began with the validities widely recognized as essential, we used the guidelines from the American Psychological Association, the National Council on Measurement in Education, and the American Educational Research Association, which propose 11 validities (e.g., face, internal, external, ecological) deemed core for typical psychometric studies in the social sciences. However, we also considered the other validities among the 400+ for better interpretation of the 11 core validities.

We extracted concepts from the definitions of the 11 research validities. For example, given the third definition [4] in Table 1, we extracted "the degree to", "experiment", "is able to demonstrate", "causal relationships", and "two variables". We then examined each concept to discover synonyms (e.g., "the degree to" and "the extent to"), being careful to consider the context and typical uses of the validities before deeming something synonymous. We also began to model the relationships among concepts (e.g., in the example above, that two variables are causally related). Next, we identified entities (e.g., variable, study, experiment), their attributes (e.g., location of a study), and the relationships among entities (e.g., a "variable" is generated within a "study"). For all key decisions, we consulted Bunge. For example, as "experiment" is not the only method for establishing causation between two variables [7], in the final step, when modeling internal validity, we decided to generalize the entity "experiment" to "inquiry", a broader entity which covers the different types of causation-focused studies conducted in positivist/quantitative research.

Fig. 1. The preliminary Phase 1 ("core 11") ontology shown using UML.

Figure 1 shows the resulting preliminary ontology. The ontology expresses the same concepts present in the original validity definitions, only using standardized language, and precisely identifies the relationships among the concepts.
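The synonym-discovery and generalization steps described above can be sketched as a small normalization table that maps phrases extracted from validity definitions to canonical ontology concepts. This is a minimal illustration only; the table entries below are assumptions drawn from the examples in the text (e.g., generalizing "experiment" to "inquiry"), not the authors' actual mapping.

```python
# Hypothetical sketch of the synonym-normalization step: phrases extracted
# from validity definitions are mapped to canonical ontology concepts.
# The mapping entries are illustrative assumptions, not the authors' data.

CANONICAL = {
    "the degree to": "extent",
    "the extent to": "extent",
    "experiment": "inquiry",               # generalized, following Bunge [7]
    "empirical investigation": "inquiry",
    "judges": "subject matter expert",
    "raters": "subject matter expert",
}

def normalize(phrases):
    """Map each extracted phrase to its canonical concept; keep unknown phrases as-is."""
    return [CANONICAL.get(p.strip().lower(), p) for p in phrases]

# Concepts extracted from the internal validity definition in [4]:
extracted = ["the degree to", "experiment", "is able to demonstrate",
             "causal relationships", "two variables"]
print(normalize(extracted))
```

In practice the mapping would be built iteratively, with each candidate synonym checked against the context and typical uses of the validity, as the text notes.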
With the development of an ontology, we can have a single, consistent language to describe phenomena of interest in the domain of validity. Among other things, this permitted us to redefine existing validity definitions. For example, given a definition of predictive criterion validity, "involves establishing how well the measure predicts future behaviors you'd expect it to be associated with" [9], we can now express it in terms of the ontology (see Figure 1) as: Predictive criterion validity (PCV1) is the extent to which the values of variable(s) manipulated in an inquiry is(are) similar to the values of criterion variable(s) obtained in the future. The acronym PCV1 is added to distinguish this sense of predictive criterion validity from others (i.e., to disambiguate the definitions and avoid the polysemy shown in Table 1).

Indeed, with the use of the ontology, we can take the definition one step further and express predictive criterion validity as a formula:

 Predictive criterion validity (PCV1) = SIM(variable.manipulated.value, variable.criterion.value)
 subject to: variable.manipulated.context.dateTime < variable.criterion.context.dateTime

where SIM() is a similarity function which compares the distance between the values of the manipulated and criterion variables. We foresee a future possibility of interpreting the output of the SIM() function for a given study (inquiry) by comparing it with quality norms automatically mined from a given discipline.

To evaluate Phase 1, we conducted a preliminary study and are planning to continue the evaluations, as per [2].
For the initial evaluation, we were interested in whether experts in modeling and ontology building judged the Phase 1 ontology (which, naturally, uses slightly different language from the original definitions) to be a faithful representation of the concepts conveyed by the original definitions of validities, and, ultimately, of the definitions themselves. We conducted the evaluation among the members of the 2019 AIS special interest group on systems analysis and design (SIGSAND), a special interest group of the Association for Information Systems that includes experts in conceptual modeling and ontology development who are also generally familiar with validities in the social sciences. The materials comprised a survey showing randomly selected definitions of validities and a question asking whether a given validity fit into the ontology. Participation was voluntary, and we received 9 completed packages.

Among the 18 responses, 15 affirmed the fit. Each of the 3 negative responses contained details about why a particular expert believed that a given definition was not well represented in the ontology. The analysis of the negative responses revealed that, rather than a lack of representation by the ontology, the source of the negative response was the way in which the participant interpreted the definition (or the concepts in the ontology). With this preliminary validation of the ontology, we now plan to conduct a larger and more focused evaluation.

===3 Conclusions and Future Work===

The early results of our work suggest that the messy, inconsistent, and ambiguous language replete in the domain of research validities can be systematized using top-down (guided by a general philosophical ontology) as well as bottom-up (mining the set of existing validities, the largest one to date, which we accumulated for this project) procedures.
The ontology of core validities we developed in Phase 1 appears to adequately represent those validities deemed essential for typical psychometric studies in the social sciences. The panel of experts validated the representational fidelity of the resulting ontology. Importantly, this shows that the procedure we employed can be extended to the 400+ validities to construct a comprehensive unified ontology of validities. In the next phase, when we develop a more stable version of the ontology, we plan to revisit its scope and purpose [1].

Once such an ontology is developed, it can be used as a basis to rewrite existing definitions (in the manner shown in the example above). Ultimately, we hope this results in a shared reference for researchers and the community. With a common and standardized language, one can increase certainty that a particular validity test performed in one paper is the same type of test as performed in other research. This can aid in the synthesis of scientific research and support building cumulative research traditions within disciplines. Having such an ontology would also provide, for the first time, a coherent single structure to organize over 400 distinct validities.

Likewise, the ontology of validities, as well as the formalized definition of each, might be used in the development of tools that support conducting research in the social sciences. For example, one can use the ontology to automatically mine existing publications and identify trends in the use of various kinds of validities. The ontology of validities can also power future semantic search engines. Finally, the project stands to extend concepts and methods from the conceptual modeling and ontology engineering community to social science validities, which is consistent with the ongoing extension of conceptual modeling beyond traditional database design [10].

===References===

# Fernandes, P.C.B., Guizzardi, R.S., Guizzardi, G.: Using goal modeling to capture competency questions in ontology-based systems. Journal of Information and Data Management. 2, 527 (2011).
# McDaniel, M., Storey, V.C.: Evaluating Domain Ontologies: Clarification, Classification, and Challenges. ACM Computing Surveys. 53, 1–40 (2019).
# Colman, A.M.: A Dictionary of Psychology. Oxford University Press, Oxford, UK (2015).
# Dacko, S.: The Advanced Dictionary of Marketing. Oxford University Press, Oxford, UK (2008).
# Gable, G.G.: Integrating case study and survey research methods: an example in information systems. European Journal of Information Systems. 3, 112–126 (1994).
# Goel, L., Johnson, N., Junglas, I., Ives, B.: Predicting users' return to virtual worlds: a social perspective. Information Systems Journal. 23, 35–63 (2013).
# Bunge, M.: Finding Philosophy in Social Science. Yale University Press, New Haven, CT (1996).
# Denzin, N.K., Lincoln, Y.S.: The Sage Handbook of Qualitative Research. Sage, Thousand Oaks, CA (2005).
# Adler, E.S., Clark, R.: An Invitation to Social Research: How It's Done. Nelson Education, New York, NY (2014).
# Jabbari, M.A., Lukyanenko, R., Recker, J., Samuel, B.M., Castellanos, A.: Conceptual Modeling Research: Revisiting and Updating Wand and Weber's 2002 Research Agenda. In: AIS SIGSAND, pp. 1–12. Syracuse, NY (2018).