=Paper=
{{Paper
|id=Vol-3051/PA_2
|storemode=property
|title=LOGANShiny: An app for illustrating process data analysis from international large-scale assessments (Short Paper)
|pdfUrl=https://ceur-ws.org/Vol-3051/PA_2.pdf
|volume=Vol-3051
|authors=Denise Reis Costa
|dblpUrl=https://dblp.org/rec/conf/edm/Costa21
}}
==LOGANShiny: An app for illustrating process data analysis from international large-scale assessments (Short Paper)==
Denise Reis Costa, Centre for Educational Measurement, University of Oslo, Norway, d.r.costa@cemo.uio.no

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===ABSTRACT===
This paper describes LOGANShiny, a Shiny application for the R package LOGAN. The app was built to give researchers and education stakeholders an overview of basic tools for starting their analysis of process data from international large-scale assessments. Using the log file data from one item administered in the PISA 2012 creative problem-solving assessment, the app is divided into three modules: (a) Data Preparation, (b) Response times, and (c) Respondent’s actions. In each module, the user can interact with the app by analyzing students’ performance on the item or comparing specific groups of students (e.g., gender or cross-country analyses). Exploring these tools can not only illustrate the potential and limitations of process data analysis from these assessments but can also advance one’s understanding of how students from 44 countries and economies interact with a problem-solving item on an international survey.

===Keywords===
Computer-based assessment; Log data; Digital items; R package.

===1. INTRODUCTION===
International large-scale assessments have received widespread attention by measuring key cognitive skills and gathering information and data on how individuals use their knowledge in different contexts. For example, since 2012, two important assessments conducted by the Organisation for Economic Co-operation and Development (OECD), the Programme for International Student Assessment (PISA) and the Programme for the International Assessment of Adult Competencies (PIAAC), not only started the administration of computer-based formats for a large number of participating countries but also made a number of items with respondents’ log file information publicly available. These log data contain a record of the interactions between the respondents and the computer testing application during the assessment.

Process data derived from these logs (e.g., response times and respondents’ actions) are of potential relevance to researchers and can provide a better understanding of a range of issues related to test-taking behavior (e.g., engagement [3], navigation behavior [5]). Despite this potential, research in this field is still not well developed due to the challenges and obstacles associated with the management of such data [4].

To overcome this difficulty, an open-source R package was developed: LOG file ANalysis in international large-scale assessments (LOGAN [10]). This package is intended to present a set of user-facing functions, so the user does not need to know the details of the underlying code or work extensively on data management to conduct specific analyses of the log files from these assessments.

Figure 1 presents the LOGAN package architecture with examples of functions related to each step of the analysis of process data. First, users import their log file data into R. For PISA 2012 log files, the package has a specific function to import the semi-processed SPSS files that are freely available at the OECD website (https://www.oecd.org/pisa/pisaproducts/database-cbapisa2012.htm). In the future, there is also an intention to support the data management of raw log file data (e.g., xml files). After data import, one can use LOGAN functions to manage and clean the data and to extract information such as the total time on the task or a specific respondent strategy.

Figure 1. LOGAN package (version 1.0.0) architecture.

To demonstrate the functionalities of the LOGAN package for use by researchers and education stakeholders interested in process data analysis, a web-based application, the LOGANShiny app, was created using Shiny [1]. Hosted at https://loganpackage.shinyapps.io/shiny/, this interactive platform offers users examples of analysis for one released PISA 2012 creative problem-solving item, Climate Control (CP025Q01).

To answer this item, students were first presented a stimulus (Figure 2) where they needed to manipulate input variables (top, central, and bottom controls/sliders) to understand how an air conditioner changes the temperature and humidity of a room. Then, students had to draw arrows on a diagram to represent the relationship between the three controls and the two outputs (temperature and humidity). Full credit was given to students who correctly completed this diagram (i.e., the top control affects temperature, and the central and bottom controls affect humidity).

The available log file data from this item captured the student’s time on the task, their exploration in applying and resetting the input variables using the sliders, the associated temperature and humidity values, and the state of the diagram at each exploration. There was no restriction on the number of times a student could manipulate these features, and they did not change by themselves without the student's interaction. Examples of studies using process data from this item are [2], [4], [6], [7], [9], and [12].

Figure 2. Stimulus information from the problem-solving Climate Control unit and the CP025Q01 item (Reprinted from [4] with permission from Elsevier).

The following sections of this paper showcase the features of the LOGANShiny app regarding data management and statistical analysis of process data for this item.

===2. DATA PREPARATION (MODULE 0)===
In the “Module 0” tab of the LOGANShiny app, the user is presented with the particularities of the Climate Control item and its related log-file data. An interactive table displays all 13 variables and 951,481 entries in the data. It represents how students from 44 countries and economies interacted with this item. The description of each variable and the first three log events of one student from the United Arab Emirates (code = “ARE”) are shown in Table 1.

{| class="wikitable"
|+ Table 1. Log data variables description with three log events from student ID = “04852” from the “ARE” country.
! Variable !! Description !! Log event 1 !! Log event 2 !! Log event 3
|-
| cnt || Country code || ARE || ARE || ARE
|-
| schoolid || School ID || 0000189 || 0000189 || 0000189
|-
| StIDStd || Student ID || 04852 || 04852 || 04852
|-
| event || Event status || START_ITEM || ACER_EVENT || ACER_EVENT
|-
| time || Event time (in seconds) || 1288.1 || 1291.9 || 1338.4
|-
| event_number || Event sequence number || 1 || 2 || 3
|-
| event_type || Event type || NULL || reset || apply
|-
| top_setting || Slider position: top || NULL || 0 || 1
|-
| central_setting || Slider position: central || NULL || 0 || 1
|-
| bottom_setting || Slider position: bottom || NULL || 0 || 1
|-
| temp_value || Temperature value || NULL || 25 || 27
|-
| humid_value || Humidity value || NULL || 25 || 28
|-
| diag_state || Diagram status || NULL || NULL || NULL
|}

The first log event for student ID = “04852” indicates when the student was exposed to the item for the first time (event status = “START_ITEM”). In this case, the registered time was 1288.1 seconds since the beginning of the assessment, and no interactions with the item features were recorded (e.g., top_setting = “NULL”).

Less than four seconds after starting the item, this student clicked on the “RESET” button. In this scenario, all the input variables were set to 0 (indicated by a triangle in Figure 2), both outputs indicated “25”, and no arrows were drawn on the diagram, which is the default. About 46 seconds after resetting the task, the same student moved all sliders one position to the right (i.e., top_setting = “1”, central_setting = “1”, and bottom_setting = “1”) and clicked on “APPLY”. In this case, the temperature value automatically changed to “27” and the humidity to “28”. Again, the diagram was still in its initial state. When a student interacts with the diagram, the information displayed in the “diag_state” variable is represented as a binary number (e.g., “000001”), with each digit associated with one input and one output variable (e.g., top control and temperature).

After showing the information that one can extract from the available log file data, the LOGANShiny app presents analytical tools. For example, a summary of the total number of event actions (including "START_ITEM" and "END_ITEM") can be produced in the app. Figure 3 illustrates the log events from 1,015 students from Bulgaria (code = “BGR”). The same analysis can also be done for data from other countries.

Figure 3. Interactive summary table of the number of event actions (including "START_ITEM" and "END_ITEM") for students from the “BGR” country.

From the provided summary statistics, one can identify issues related to the OECD log data. For example, one student from this country has only one entry in the log file data (i.e., one event action). Since at least two events are expected for each student in this dataset ("START_ITEM" and "END_ITEM"), the app emphasizes the importance of a closer look at the data and acknowledges researchers’ freedom to review, filter, or delete such inconsistencies.

After data management, two analytical tools are provided in the subsequent tabs: Time (Module 1) and Actions - Cognitive related (Module 2).

===3. RESPONSE TIMES (MODULE 1)===
On this tab, the amount of time students spent on the Climate Control item is analyzed. First, the user decides whether the analysis of the total time will be conducted by item performance (CP025Q01=0: incorrect answer; CP025Q01=1: correct answer) or by gender (ST04Q01=1: female; ST04Q01=2: male). Then, the user can choose whether the analyses will consider all countries or a specific country, as illustrated in Figure 4.
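The event-count check and the time-on-task computation described above can be sketched in a few lines. The example below is a plain Python illustration, not the LOGAN R functions, whose signatures are not given in this paper. It assumes that time on task is the END_ITEM time minus the START_ITEM time; the helper name `summarize`, the END_ITEM row, and the second student are hypothetical and exist only to make the example complete.

```python
from collections import defaultdict

# Rows mimic a subset of the Table 1 variables (time is in seconds).
# The first three rows are the events shown in Table 1. The END_ITEM row
# and student "99999" are hypothetical additions for illustration only.
LOG = [
    {"cnt": "ARE", "StIDStd": "04852", "event": "START_ITEM", "time": 1288.1},
    {"cnt": "ARE", "StIDStd": "04852", "event": "ACER_EVENT", "time": 1291.9},
    {"cnt": "ARE", "StIDStd": "04852", "event": "ACER_EVENT", "time": 1338.4},
    {"cnt": "ARE", "StIDStd": "04852", "event": "END_ITEM", "time": 1408.1},
    {"cnt": "ARE", "StIDStd": "99999", "event": "START_ITEM", "time": 500.0},
]


def summarize(log):
    """Return per-student event counts, time on task (minutes), and a flag."""
    by_student = defaultdict(list)
    for row in log:
        by_student[(row["cnt"], row["StIDStd"])].append(row)

    summary = {}
    for key, rows in by_student.items():
        start = [r["time"] for r in rows if r["event"] == "START_ITEM"]
        end = [r["time"] for r in rows if r["event"] == "END_ITEM"]
        complete = bool(start) and bool(end)
        # Assumption: time on task = END_ITEM time - START_ITEM time.
        time_min = (end[0] - start[0]) / 60 if complete else None
        summary[key] = {
            "n_events": len(rows),
            "time_min": time_min,
            "incomplete": not complete,  # fewer events than expected
        }
    return summary


SUMMARY = summarize(LOG)
for key, stats in SUMMARY.items():
    print(key, stats)
```

Students flagged as incomplete correspond to the kind of inconsistency noted for the Bulgarian data; whether to review, filter, or delete them remains the analyst's decision.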
Figure 4. Type of analytical tools presented at “Module 1”.

After these choices, two types of descriptive statistics are provided: a summary table and a density plot.

====3.1 Summary table of response times====
Figure 5 displays the information that one can gather from the LOGAN package for the analysis of the overall time and of the time by item performance. A total of 30,345 students from all PISA 2012 participating countries and economies were analyzed for the Climate Control item. The maximum amount of time spent on this item was 26 minutes, by a student who got an incorrect answer. For the group of students who got a correct answer, the maximum was 16 minutes.

In general, students spent an average of 2 minutes on the task. However, this estimate is not precise since negative response times were observed in this sample (i.e., the minimum amount of time for those who got an incorrect answer was equal to -0.43). Although one could remove such cases from the dataset, as they lower the average values, they are displayed in LOGANShiny to reveal another inconsistency in this log-file dataset.

Even though this type of discrepancy could be detected at the data preparation stage of the analysis, it was left to the “Module 1: Time” tab so that LOGANShiny highlights again the importance of further inspection of process data and proper data manipulation of the files from large-scale assessments.

Figure 5. Interactive summary table of the overall time on the task (in minutes) and by item performance for students from all participating countries and economies from PISA 2012 problem-solving assessment.

====3.2 Time density plot====
Figure 6 illustrates the distribution of time by performance on the task, obtained with the LOGANShiny app. One could also plot the distribution of time by gender.

Figure 6. Distribution of the total time students from all PISA 2012 participating countries and economies spent on the item.

===4. RESPONDENT’S ACTION (MODULE 2)===
To illustrate how to explore the actions recorded in the log files, LOGANShiny describes two respondent action strategies discussed in [4], both based on the vary-one-thing-at-a-time (VOTAT) strategy. In the case of the Climate Control item, the VOTAT strategy consists of a student varying one specific variable (e.g., putting the top control on "++") while keeping all other variables constant (i.e., leaving the central and bottom controls on the delta symbol), and clicking on "apply". To operationalize the VOTAT strategy, the authors of [4] suggest:

(a) VOTAT 1: a dichotomous variable that assigns “1” to students who applied VOTAT for all input variables; and

(b) VOTAT 2: a variable with four categories: no isolated variation at all (category 0), isolated variation of one input variable (category 1; for example, only the top control), isolated variation of two input variables (category 2; for example, the top and bottom controls), and isolated variation of all three input variables (category 3).

Note that VOTAT 2 category 3 is the same as VOTAT 1 = “1”. Category 0, on the other hand, indicates the case where the student did not vary any slider in isolation, that is, did not move any slider or varied several sliders at a time before clicking the “apply” button.

To illustrate how to derive these VOTAT variables from the log data, the third log event from Table 1 shows the case where the student selected top_setting = “1”, central_setting = “1”, and bottom_setting = “1” before clicking on apply. In this scenario, both VOTAT 1 and VOTAT 2 would receive the value “0”, since no isolated variation of any control was found.

Based on these categories, one can investigate how performance outcomes and VOTAT strategies are related by country, item-level performance (CP025Q01=0: incorrect answer; CP025Q01=1: correct answer), and overall problem-solving performance (first plausible value, PV1CPRO). To do this in LOGANShiny, one should select the type of VOTAT strategy of interest, followed by the participating country the analyses will relate to (Figure 7).

Figure 7. Type of analytical tools presented at “Module 2”.

After these choices, two types of descriptive and correlational statistics are provided: a summary report and a frequency plot.

====4.1 Summary report of student’s strategies and performance====
In LOGANShiny, it is possible to produce a statistical summary of students’ exploration via the “VOTAT 1” strategy and its relationship with performance. This analysis is presented as a report divided into three parts: (1) a frequency table, (2) measures of association between strategy and item performance, and (3) a summary of test performance (considering the first plausible value from the PISA 2012 problem-solving assessment) by VOTAT strategy. Figures 8 and 9 show an example of this report.

From Figure 8, it is possible to see that about half of the students in this sample applied the VOTAT 1 strategy at least once. Among the students who got a correct answer on the CP025Q01 item, the majority (12,404 out of 15,076 students) applied this strategy at least once while working on the item. Correlational measures (i.e., chi-square statistic and phi coefficient) are also provided to evaluate the strength of the association between these variables.

Figure 8. First part of the interactive summary report with the analysis of student’s strategy “VOTAT 1” and item performance of all students from the PISA 2012 problem-solving assessment.

Imputation methods are used in PISA to generate plausible values to report students’ overall performance [8]. On a scale with a mean score among OECD countries of 500, five plausible values were defined for the PISA 2012 creative problem-solving assessment. In LOGANShiny, an analysis using one plausible value is illustrated in Figure 9.

Figure 9. Second part of the interactive summary report with the analysis of student’s strategy “VOTAT 1” and overall performance (first plausible value, “PV1CPRO”) of all students from the PISA 2012 problem-solving assessment.

Based on the provided statistics, students who used the VOTAT 1 strategy on the Climate Control item scored, on average, more than 100 points higher on the PISA 2012 creative problem-solving assessment than those who did not use this strategy.

====4.2 Frequency Plot====
In PISA, students’ scores in the assessments are also divided into proficiency scale levels to give substantive meaning to the overall performance. For the PISA 2012 creative problem-solving assessment, seven levels of proficiency were created, where level 1 (358 < PV1CPRO ≤ 423) corresponds to an elementary level of problem-solving skills and level 6 (PV1CPRO ≥ 683) to the highest level. A complete description of these levels is presented in Figure V.2.2 of the OECD report [8].

In LOGANShiny, these proficiency levels are plotted in relation to the use of the VOTAT strategy. Figure 10 illustrates this relationship. Here, percentages within the categorized proficiency score are provided in parentheses for each PISA proficiency level. Findings from this analysis indicate that students at the higher levels of the scale tend to use “VOTAT 1” more than those at the lower levels of the PISA 2012 creative problem-solving proficiency scale.

Figure 10. Frequency of students by “VOTAT 1” strategy and PISA proficiency levels for students from all participating countries and economies in the PISA 2012 problem-solving assessment.

===5. CONCLUSION===
In this paper, LOGANShiny is presented as an illustrative tool for showcasing the functionalities of the LOGAN R package for the analysis of process data from international large-scale assessments. Its interactive tables and graphical displays are intended to shed light on the potential and limitations of log-file data with regard to data management and the analysis of response times and students’ actions. This app can be a valuable tool to deepen researchers’ and education stakeholders’ knowledge of the item features and to provide insights into students’ cognitive processes. Understanding how process data can be extracted and analyzed may not only inspire the development of new item features that could enrich one’s experience with digital environments, but also has the potential to improve the assessment’s results by, for instance, incorporating process data into the scoring procedure [11].

===6. REFERENCES===
[1] Chang, W., Cheng, J., Allaire, J., Xie, Y., & McPherson, J. (2020). shiny: Web Application Framework for R. R package version 1.4.0.2. https://CRAN.R-project.org/package=shiny

[2] Chen, Y., Li, X., Liu, J., & Ying, Z. (2019). Statistical Analysis of Complex Problem-Solving Process Data: An Event History Analysis Approach. Frontiers in Psychology, 10, 486. https://doi.org/10.3389/FPSYG.2019.00486

[3] Goldhammer, F., Martens, T., & Lüdtke, O. (2017). Conditioning factors of test-taking engagement in PIAAC: an exploratory IRT modelling approach considering person and item characteristics. Large-Scale Assessments in Education, 5:18. https://doi.org/10.1186/s40536-017-0051-9

[4] Greiff, S., Wüstenberg, S., & Avvisati, F. (2015). Computer-generated log-file analyses as a window into students’ minds? A showcase study based on the PISA 2012 assessment of problem solving. Computers and Education, 91, 92–105. https://doi.org/10.1016/j.compedu.2015.10.018

[5] Hahnel, C., Goldhammer, F., Naumann, J., & Kröhne, U. (2016). Effects of linear reading, basic computer skills, evaluating online information, and navigation on reading digital text. Computers in Human Behavior, 55, 486–500. https://doi.org/10.1016/j.chb.2015.09.042

[6] Han, Z., He, Q., & von Davier, M. (2019). Predictive Feature Generation and Selection from Process Data in PISA Simulation-Based Environment: An Implementation of Tree-based Ensemble Methods. Frontiers in Psychology, 10, 2461. https://doi.org/10.3389/fpsyg.2019.02461

[7] He, Q., & von Davier, M. (2016). Analyzing Process Data from Problem-Solving Items with N-Grams. In Handbook of Research on Technology Tools for Real-World Skill Development (pp. 750–777). https://doi.org/10.4018/978-1-4666-9441-5.ch029

[8] OECD. (2014). PISA 2012 Results: Creative Problem Solving (Volume V). OECD Publishing. https://doi.org/10.1787/9789264208070-en

[9] Pejic, A., & Molcer, P. S. (2016). Exploring data mining possibilities on computer based problem solving data. SISY 2016 - IEEE 14th International Symposium on Intelligent Systems and Informatics, Proceedings, 171–176. https://doi.org/10.1109/SISY.2016.7601491

[10] Reis Costa, D., & Leoncio, W. (2019). LOGAN: An R package for log file analysis in international large-scale assessments. R Package. https://cran.r-project.org/web/packages/LOGAN/index.html

[11] Reis Costa, D., Bolsinova, M., Tijmstra, J., & Andersson, B. (2021). Improving the Precision of Ability Estimates Using Time-On-Task Variables: Insights From the PISA 2012 Computer-Based Assessment of Mathematics. Frontiers in Psychology, 12. https://doi.org/10.3389/fpsyg.2021.579128

[12] Xu, H., Fang, G., Chen, Y., Liu, J., & Ying, Z. (2018). Latent Class Analysis of Recurrent Events in Problem-Solving Items. Applied Psychological Measurement, 42(6), 476–498. https://doi.org/10.1177/0146621617748325
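Returning to the VOTAT 1 and VOTAT 2 variables defined in Section 4, their derivation from the log events can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the LOGAN implementation: a slider counts as "kept constant" when its setting is 0 (the neutral position marked by the triangle), only events with event_type "apply" are considered, and the function name `votat_scores` and the two example students other than the Table 1 case are hypothetical.

```python
SLIDERS = ("top_setting", "central_setting", "bottom_setting")


def votat_scores(events):
    """Return (votat1, votat2) for one student's list of log events."""
    isolated = set()
    for ev in events:
        if ev.get("event_type") != "apply":
            continue  # START_ITEM, reset, and diagram events are ignored
        # Sliders moved away from the neutral position (0) in this trial.
        moved = [s for s in SLIDERS if int(ev[s]) != 0]
        if len(moved) == 1:
            isolated.add(moved[0])  # exactly one input varied in isolation
    votat2 = len(isolated)  # category 0..3
    votat1 = 1 if votat2 == len(SLIDERS) else 0
    return votat1, votat2


def trial(top, central, bottom, event_type="apply"):
    """Build one log event with the slider variables named as in Table 1."""
    return {"event_type": event_type, "top_setting": top,
            "central_setting": central, "bottom_setting": bottom}


# Student from Table 1: a reset, then all three sliders moved at once.
table1_student = [trial(0, 0, 0, "reset"), trial(1, 1, 1)]
# Hypothetical students for contrast.
systematic_student = [trial(2, 0, 0), trial(0, -1, 0), trial(0, 0, 1)]
partial_student = [trial(1, 0, 0), trial(0, 0, 2), trial(1, 1, 0)]

for name, events in [("table1", table1_student),
                     ("systematic", systematic_student),
                     ("partial", partial_student)]:
    print(name, votat_scores(events))
```

Under these assumptions the Table 1 student receives 0 on both variables, in line with the description in Section 4, while VOTAT 1 equals 1 exactly when VOTAT 2 reaches category 3.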