Using epistemic information to improve learning gains in a computer-supported collaborative learning context

Max Dieckmann, Davinia Hernández-Leo

Universitat Pompeu Fabra (UPF), Plaça de la Mercè, 10-12, 08002 Barcelona

Abstract
Computer-supported collaborative learning (CSCL) is a method in education where the students work together on a task while the teacher takes on the role of a coach who, aided by information technology, scaffolds their progress and allows them to discover a solution on their own. CSCL exercises are often run following a script, which breaks the activity into a set number of steps to facilitate productive collaboration. This makes it easier for the teacher to orchestrate the exercise, that is, to control the flow of the activity and attend to the students' needs as they arise. Teacher-facing dashboards are often used to enable orchestration by providing information about the state of the activity and controls to manipulate it. Our research is centered on analyzing whether teachers and students can benefit from visualizing epistemic information, i.e. learning analytics data derived from examining the content of students' input. We expect that giving teachers access to epistemic information will facilitate orchestration, reduce the cognitive load required to oversee a CSCL activity, and create the opportunity for teacher-led debriefing, a technique used by educators to make students reflect on the activity they engaged in and thus help them get a deeper understanding of the content that was covered. We also expect that this will ultimately have a positive impact on students' learning gains. We will extend the dashboard of “PyramidApp”, a software tool that implements the CSCL “Pyramid” script, with epistemic information to test our hypothesis. Subsequently, we will analyze how our findings transfer to other CSCL scripts and tools.
We thus hope to contribute to the existing knowledge of how learning analytics data can successfully be employed in a CSCL context. We will follow the design-based research method, which emphasizes cooperation with teachers and aims to test and apply interventions in realistic scenarios.

Keywords
Computer-supported collaborative learning, orchestration, teacher-led debriefing, epistemic information, design-based research

Proceedings of the Doctoral Consortium of Sixteenth European Conference on Technology Enhanced Learning, September 20–21, 2021, Bolzano, Italy (online).
EMAIL: max.dieckmann@upf.edu (M. Dieckmann); Davinia.hernandez-leo@upf.edu (D. Hernández-Leo)
ORCID: 0000-0001-7128-8337 (M. Dieckmann); 0000-0003-0548-7455 (D. Hernández-Leo)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

The idea of using computers in education dates back to the 1960s [1]. What was initially a fringe approach has become more and more common and shows no signs of slowing down [2]. Using this technology for teaching and learning has great appeal for both educational institutions and researchers. Subsequently, the field of technology enhanced learning (TEL) emerged and with it a plethora of studies. This has been particularly evident since the beginning of the coronavirus crisis, as many institutions were forced to conduct at least part of their lessons online [3]. While the actual impact of using technology for education has been criticized, the endeavor is still viewed as promising [4]. Another frequent criticism is that the results from the lab don't translate to the reality of the classroom, or that they never make it there in the first place [5]. However, with further development comes further progress: Many researchers place an emphasis on developing and testing their interventions in realistic scenarios and are adding to the growing amount of evidence that enhancing learning through technology is not only possible, but worthwhile.

Learning analytics is a fast-growing area of TEL and is defined as “the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs” [6]. Typically, learning analytics data is automatically collected and processed by machines. One benefit of this approach is that large amounts of data can be handled and made use of, potentially in real time.

Another relatively modern trend in education is collaborative learning [7]. This means that the students will work together on a task and try to find a solution, rather than being directly told how to get there. The role of the teacher becomes that of a coach, who scaffolds the students’ progress rather than giving them the correct answers / techniques outright. This is also referred to as “guided participation”. There are many forms of collaborative learning, but the most effective approaches seem to be those that put a focus on intrinsic incentives (e.g. the student’s natural search for knowledge, competence, and stimulating communication) and frame the task in a way that emphasizes collaboration rather than competition. The positive effects of this method are most notable when looking at conceptual insights that are acquired by the students, something that is notoriously difficult to teach. However, collaborative learning is no more successful than direct instruction when teaching formulas, procedures, or the application of an existing model.

Computer-supported collaborative learning (CSCL) is the combination of collaborative learning and technology enhanced learning [2, 8, 9, 10]. It has the potential to solve some of the problems that arise when implementing a collaborative learning task and has seen a lot of activity in the last decades. Unlike in direct instruction, the teacher's attention is split among several groups, which will likely work at different paces and struggle at different times. In order to manage this demand, a CSCL activity will often be run following a CSCL script, which scaffolds [11] the students and provides a clear pattern to follow [12]. One of the main benefits of using computer technology in a CSCL context is that the scripted activity can be automated, reducing organizational overhead and in many cases making it possible to implement an exercise that would not be possible otherwise. There are indications that this is beneficial to students by increasing their motivation, shaping their expectations and freeing up time to focus on the task.

While a CSCL script gives the task a clear structure, with all the upsides that such a guide brings, technology can help make its implementation more flexible to its specific context. This is described by the notion of orchestration: The teacher needs to respond to the students' needs as they arise and adapt the exercise to the current situation [13, 14]. Computer technology can provide the teacher with data that they can use to better orchestrate the activity or gain valuable information they can use to prepare future lectures. This is often done in the form of a teacher-facing dashboard, where the teacher can control the state of the exercise. Common use cases are pausing the activity to clear up misconceptions or motivate non-participating students, skipping unnecessary waiting time when moving on to the next stage, and identifying and scaffolding struggling groups.

There have been several implementations of teacher-facing dashboards that visualize learning analytics data. Our focus will be on the visualization of epistemic information derived from analyzing the content of the students' inputs (answers, chat messages etc.). We expect that visualizing synthesized epistemic information can reduce teacher cognitive load, as it drastically reduces the amount of text a teacher has to read to follow the students' progress. Additionally, we expect this to have a positive impact on orchestration by making it easier to identify when and where to intervene, as well as to facilitate teacher-led debriefing by highlighting the most relevant student contributions for further discussion.

In teacher-led debriefing lectures, students' answers are put into perspective and addressed in the light of new course content. Students are required to justify their beliefs, receive feedback on their performance and thus get to structure their newly acquired knowledge before integrating it into a theoretical framework [13, 15]. Similar techniques have already been successfully applied in simulation-based medical education, where debriefing is considered to be an important component of the learning experience [16, 17].

We are basing our assumptions about the impact of our intervention in part on a study similar to our own, in which content analysis data was added to a teacher-facing dashboard to support the CSCL activity EthicApp [18]. The data visualizations were derived using natural language processing (NLP) techniques on student data, rank ordering comments by relevance and comparing the work groups by how homogeneous their members' opinions are. Results were promising: Experts judged about 80% of the selected comments as viable, which indicates that this approach could be useful in reducing the number of comments teachers have to consider when monitoring an activity and thus reducing cognitive load.

Using NLP technology to analyze students' artefacts and utterances for learning analytics is not without precedent and there are several techniques that seem promising [19, 20, 21]. One such technique is the analysis of text to obtain a measure of the level of confusion and precision in the students' answers [22, 23]. Other studies showed the potential to investigate semantic similarity, sentiment, and point-of-view, going as far as being able to gauge the degree of collaboration within a group that is working on a CSCL task [21, 24, 25].

Ultimately, we expect that the effects of our intervention will extend from the teachers to the students and have a positive impact on their learning gains.

2. Research context

An example of a CSCL script is the “Pyramid” (sometimes referred to as “Snowball”), which is structured as follows [26]: The teacher will initially give a task to the students, usually to answer an open question. In the first stage, the students will each individually think about and write down their answer. In the second stage, they are presented with a selection of answers from their peers and rate these answers by what they think are the most correct and complete. In the third stage, the collaboration truly begins, as the students are assigned to groups where they discuss the previously rated answers and synthesize an answer for the group. Finally, the group answers are rated by all students and thus the class agrees on one final answer. Depending on the size of the class, stages 2 and 3 will be repeated with larger and larger groups, until a final consensus is reached.

Another example of a CSCL script is the “Jigsaw”: First, students work on their own on one of several topics. Then, expert groups are formed by grouping the students by the topic that they worked on. In these groups, the students help each other understand their topic in depth and prepare to present it to non-experts. In the last phase, heterogeneous groups are formed by mixing students in a way that each group has at least one expert on each topic. They then take turns explaining what they are now proficient in to the non-experts until the whole group understands the entire range of topics.

PyramidApp is a software tool that implements the “Pyramid” script, making it easy to integrate it into a classroom lesson or online course [27, 28]. Figure 1 shows the group stage of a “Pyramid” script in PyramidApp. PyramidApp also comes with a teacher-facing dashboard, which provides information about the state of the activity and gives the teacher controls for orchestration (see Figure 2) [14].

Figure 1: (Stage 3) Students collaborate in a group and agree on a collective answer.

Figure 2: Part of the dashboard of the PyramidApp. The dashboard provides information and controls for orchestration to the teacher.

We will initially focus on the “Pyramid” script and PyramidApp, but we are hoping to extend the research by analyzing to what extent the interventions that will be designed and evaluated are transferable to other CSCL scripts such as “Jigsaw” or “ArgueGraph” [29, 30].

3. Research questions

To sum up, the research questions that we want to answer are the following:

1. How can teacher-oriented dashboards with learning analytics (LA) indicators based on epistemic information facilitate teacher-led debriefing in CSCL scripts?
2. How can teacher-oriented dashboards with LA indicators based on epistemic information facilitate real-time orchestration in CSCL scripts?
3. Do teacher interventions informed by LA indicators related to epistemic information improve learning gains?

Section 1 covers the background and motivation of our questions, section 2 introduces a concrete implementation of a CSCL script that we will build upon to test our questions, section 4 lays out the methodology we will use to attempt to answer our questions, and section 5 concludes by describing the impact we expect answering our questions to have.

4. Methodology & methods

Design-based research is a paradigm that aims to bring educational research back to where it has the most impact [5, 31]. Instead of separating the laboratory and the classroom, the researchers are collaborating with all stakeholders to make the research realistic and applicable. Interventions go through several design cycles, where the initial experiment will be refined and the results integrated into the underlying theory.

We are going to explore several approaches to gather and present epistemic information in the PyramidApp dashboard and implement the interventions in practice. We will use existing data from previous experiments with PyramidApp to analyze the feasibility of the different presentation approaches and co-design prototypes in cooperation with the stakeholders.

Following the design-based research methodology, the project will go through several cycles. Figure 3 shows the typical phases of each cycle (taken from [32]).

Figure 3: Overview of the design-based research method.

4.1. Analysis

In the analysis stage, we conducted a literature review and identified that providing epistemic information to the teacher during a CSCL activity could lead to improved orchestration and debriefing. We then gathered several ideas for possible ways in which epistemic information could be gathered (see Table 1) and integrated (see Table 2) into the PyramidApp dashboard. It should be mentioned that they are not mutually exclusive and we hope to be able to implement several of them simultaneously.

Table 1: Potential methods to collect epistemic data.

Table 2: Potential methods to use epistemic data. Cell colors indicate whether the method has potential applications for orchestration (requires real-time display), teacher-led debriefing (displayed at the end of the activity) or both.

Before moving to the second stage (development), we will need to identify which of these options are the most promising in terms of feasibility and impact. To achieve this, we will analyze existing PyramidApp data that we have access to. This data comes from previous applications of PyramidApp in real classroom scenarios. It consists of all inputs made in the application, both from teachers (e.g. interactions with the dashboard) and students (e.g. answers and chat messages), as well as metadata such as timestamps. Some of the students’ answers have also been rated by teachers, giving us additional information that could help to automatically identify the quality of a student-submitted text. In some cases, we might also develop low-fidelity prototypes to gauge the technical feasibility of our ideas. Finally, we will create mock-up visualizations and seek feedback from teachers. This preliminary work should allow us to identify the most promising approaches and might lead us to discard or add ideas.

4.2. Development

In the development stage, we will now be able to make an informed decision on which and how many of the visualizations we want to implement and will begin by creating a low-fidelity, “proof-of-concept” prototype. We will seek feedback from colleagues and teachers and improve it until we have a first version that is sophisticated enough for a realistic test.

4.3. Testing

We will then enter the testing stage, where we intend to conduct multiple within-subjects experiments running a PyramidApp activity with and without epistemic information in a realistic classroom or Massive Open Online Course (MOOC) setting. This is the phase where we collect our data: we will use the PyramidApp software to automatically log all inputs of both students and teachers during the activity (the data we analyzed in stage one was collected in the same way in the past). We will also need to keep track of what was displayed in the dashboard at any time, ask experts to rate the students' answers, and have teachers and students answer questionnaires. We will consider using a dual-task method to directly measure teacher cognitive load [35]. If necessary, we will fix errors, improve the software and conduct additional tests until we have preliminary results.

4.4. Reflection

This data will then be analyzed in the reflection stage. We will attempt to integrate the findings into our understanding of the underlying theory and identify where things went well and where there were problems. We will reflect on the impact that our intervention had by comparing it to the activities where teachers did not have access to epistemic information. We expect to see a positive impact in the form of a measurable reduction in cognitive load, increase in the ease of orchestration, facilitation of teacher-led debriefing, and student learning gains.

When considering learning gains, it has to be kept in mind that giving a correct answer does not necessarily mean that one knows what they are doing, but measuring (or even defining) understanding is challenging [36]. We will focus on tangible expert scores for the time being, but might incorporate alternative measurements in the future.

We will then use all the insights that we've gained to begin the second design-based research cycle. We will ask ourselves whether the data we gathered and analyzed was sufficient to confirm or deny our expectations and answer our research questions. We will consider what would be necessary to extend our results to other CSCL scripts. Our considerations will let us decide whether we need to run additional experiments, formulate new research questions, or further develop our epistemic data visualizations.

Figure 4 summarizes what the first design-based research cycle looks like for this project.

Figure 4: Planned first design-based research cycle for this project.

5. Conclusions

Following the design-based research philosophy, the ultimate goal of our research is the application of the findings in real teaching situations in a way that improves learning gains and / or reduces the workload of the people involved.

Our expected contribution is the development of visualizations of learning analytics data based on epistemic information to reduce cognitive load, support orchestration, and facilitate debriefing of CSCL scripts. We expect that this will improve learning gains and we will directly implement and validate it for the “Pyramid” script as well as critically examine and discuss its value for other types of CSCL scripts such as “Jigsaw” or “ArgueGraph”.

The indirect influence of the research would be through the insights gained. The theory of the science of learning could be extended by getting valuable information on the effects and effectiveness of debriefing and orchestration in a CSCL context. Proving (or disproving) its impact can inform the direction of further research and lead to the development of successful interventions in the future.

It should not be forgotten that even a “negative” result would be significant, as it could suggest that a specific type of intervention is inferior and the time of educators is better spent elsewhere.

In this way, we hope to make a contribution to the further improvement of educational practice.

6. Acknowledgements

Thanks to Ishari Amarasinghe for providing me with ideas for ways in which epistemic information could be gathered and integrated into the PyramidApp dashboard.

7. References

[1] G. Paquette, Technology-based instructional design: Evolution and major trends, Handbook of Research on Educational Communications and Technology: Fourth Edition (2014) 661–671. doi:10.1007/978-1-4614-3185-5_53.
[2] H. Jeong, C. E. Hmelo-Silver, K. Jo, Ten years of computer-supported collaborative learning: A meta-analysis of cscl in stem education during 2005–2014, Educational Research Review 28 (2019) 100284. doi:10.1016/J.EDUREV.2019.100284.
[3] T. Surma, P. A. Kirschner, Technology enhanced distance learning should not forget how learning happens, Computers in Human Behavior 110 (2020) 106390. doi:10.1016/J.CHB.2020.106390.
[4] A. Kirkwood, L. Price, Technology-enhanced learning and teaching in higher education: what is ‘enhanced’ and how do we know? a critical literature review, Learning, Media and Technology 39 (2014) 6–36. doi:10.1080/17439884.2013.770404.
[5] F. Wang, M. J. Hannafin, Design-based research and technology-enhanced learning environments, Educational Technology Research and Development 53 (2005) 5–23. doi:10.1007/BF02504682.
[6] R. Ferguson, Learning analytics: Drivers, developments and challenges, International Journal of Technology Enhanced Learning 4 (2012) 304–317. doi:10.1504/IJTEL.2012.051816.
[7] W. Damon, E. Phelps, Critical distinctions among three approaches to peer education, International Journal of Educational Research 13 (1989) 9–19. doi:10.1016/0883-0355(89)90013-X.
[8] E. Lehtinen, K. Hakkarainen, L. Lipponen, M. Rahikainen, H. Muukkonen, Computer supported collaborative learning: A review, The JHGI Giesbers reports on education 10 (1999) 1999.
[9] P. Dillenbourg, S. Järvelä, F. Fischer, The evolution of research on computer-supported collaborative learning, Technology-Enhanced Learning: Principles and Products (2009) 3–19. doi:10.1007/978-1-4020-9827-7_1.
[10] G. Stahl, T. Koschmann, D. Suthers, Computer-supported collaborative learning, The Cambridge Handbook of the Learning Sciences (2014) 479–500. doi:10.1017/CBO9781139519526.029.
[11] J. Hammond, P. Gibbons, What is scaffolding, Teachers’ voices 8 (2005) 8–16.
[12] A. Weinberger, I. Kollar, Y. Dimitriadis, K. Mäkitalo-Siegl, F. Fischer, Computer-supported collaboration scripts, Technology-Enhanced Learning: Principles and Products (2009) 155–173. doi:10.1007/978-1-4020-9827-7_10.
[13] P. Dillenbourg, P. Jermann, Technology for classroom orchestration, New Science of Learning: Cognition, Computers and Collaboration in Education (2010) 525–552. doi:10.1007/978-1-4419-5716-0_26.
[14] I. Amarasinghe, D. Hernandez-Leo, K. Michos, M. Vujovic, An actionable orchestration dashboard to enhance collaboration in the classroom, IEEE Transactions on Learning Technologies 13 (2020) 662–675. doi:10.1109/TLT.2020.3028597.
[15] P. Dillenbourg, F. Hong, The mechanics of cscl macro scripts, International Journal of Computer-Supported Collaborative Learning 3 (2008) 5–23. doi:10.1007/S11412-007-9033-1.
[16] J. W. Rudolph, R. Simon, D. B. Raemer, W. J. Eppich, Debriefing as formative assessment: Closing performance gaps in medical education, Academic Emergency Medicine 15 (2008) 1010–1016. doi:10.1111/J.1553-2712.2008.00248.X.
[17] T. Levett-Jones, S. Lapkin, A systematic review of the effectiveness of simulation debriefing in health professional education, Nurse Education Today 34 (2014) e58–e63. doi:10.1016/J.NEDT.2013.09.020.
[18] C. Alvarez, G. Zurita, A. Carvallo, P. Ramírez, E. Bravo, N. Baloian, Automatic content analysis of student moral discourse in a collaborative learning activity, in: D. Hernández-Leo, R. Hishiyama, G. Zurita, B. Weyers, A. Nolte, H. Ogata (Eds.), Collaboration Technologies and Social Computing, Springer International Publishing, Cham, 2021, pp. 3–19. doi:10.1007/978-3-030-85071-5_1.
[19] L. Albo, M. Beardsley, I. Amarasinghe, D. Hernandez-Leo, Individual versus computer-supported collaborative self-explanations: How do their writing analytics differ?, Institute of Electrical and Electronics Engineers Inc., 2020, pp. 132–134. doi:10.1109/ICALT49669.2020.00046.
[20] C. P. Rose, Discourse analytics, Handbook of Learning Analytics (2017) 105–114. doi:10.18608/HLA17.009.
[21] D. S. McNamara, L. K. Allen, S. A. Crossley, M. Dascalu, C. A. Perret, Natural language processing and learning analytics, Handbook of Learning Analytics (2017) 93–104. doi:10.18608/HLA17.008.
[22] I. Amarasinghe, D. Hernández-Leo, E. Theophilou, J. Roberto Sánchez Reina, R. A. L. Quintero, Learning gains in pyramid computer-supported collaboration scripts: Factors and implications for design, in: D. Hernández-Leo, R. Hishiyama, G. Zurita, B. Weyers, A. Nolte, H. Ogata (Eds.), Collaboration Technologies and Social Computing, Springer International Publishing, Cham, 2021, pp. 35–50. doi:10.1007/978-3-030-85071-5_3.
[23] T. Atapattu, K. Falkner, M. Thilakaratne, L. Sivaneasharajah, R. Jayashanka, An identification of learners’ confusion through language and discourse analysis, 2019. URL: https://arxiv.org/abs/1903.03286v1. arXiv:1903.03286.
[24] J. J. Jiang, D. W. Conrath, Semantic similarity based on corpus statistics and lexical taxonomy, CoRR cmp-lg/9709008 (1997). URL: http://arxiv.org/abs/cmp-lg/9709008.
[25] W. Medhat, A. Hassan, H. Korashy, Sentiment analysis algorithms and applications: A survey, Ain Shams Engineering Journal 5 (2014) 1093–1113. doi:10.1016/J.ASEJ.2014.04.011.
[26] D. Hernández-Leo, E. D. Villasclaras-Fernández, J. I. Asensio-Pérez, Y. Dimitriadis, I. M. Jorrín-Abellán, I. Ruiz-Requies, B. Rubia-Avi, Collage: A collaborative learning design editor based on patterns, Educational Technology & Society 9 (2006) 58.
[27] K. Manathunga, D. Hernández-Leo, Authoring and enactment of mobile pyramid-based collaborative learning activities, British Journal of Educational Technology 49 (2018) 262–275. doi:10.1111/BJET.12588.
[28] K. Manathunga, D. Hernández-Leo, Pyramidapp: Scalable method enabling collaboration in the classroom, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 9891 LNCS (2016) 422–427. doi:10.1007/978-3-319-45153-4_37.
[29] P. Dillenbourg, F. Hong, The mechanics of cscl macro scripts, International Journal of Computer-Supported Collaborative Learning 3 (2008) 5–23. doi:10.1007/S11412-007-9033-1.
[30] P. Dillenbourg, Over-scripting cscl: The risks of blending collaborative learning with instructional design, Three worlds of CSCL. Can we support CSCL (2002) 61–91. URL: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.457.9255&rep=rep1&type=pdf.
[31] T. Anderson, J. Shattuck, Design-based research: A decade of progress in education research?, Educational Researcher 41 (2012) 16–25. doi:10.3102/0013189X11428813.
[32] T. Amiel, T. C. Reeves, Design-based research and educational technology: Rethinking technology and the research agenda, Journal of educational technology & society 11 (2008) 29–40.
[33] J. Pujara, H. Miao, L. Getoor, W. Cohen, Knowledge graph identification, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8218 LNCS (2013) 542–557. doi:10.1007/978-3-642-41335-3_34.
[34] J. Ramos, et al., Using tf-idf to determine word relevance in document queries, in: Proceedings of the first instructional conference on machine learning, volume 242, Citeseer, 2003, pp. 29–48.
[35] R. Brunken, J. L. Plass, D. Leutner, Direct measurement of cognitive load in multimedia learning, Educational Psychologist 38 (2003) 53–61. doi:10.1207/S15326985EP3801_7.
[36] R. S. Nickerson, Understanding understanding, American Journal of Education 93 (1985) 201–239. doi:10.1086/443791.