Participation and Interaction Patterns of MOOC Learners with Different Learning Achievements: A Collective Attention Network Perspective

Ming Gao 1, Jingjing Zhang 1,2,3

1 Research Centre of Distance Education, Beijing Normal University, Beijing, China
2 Big Data Centre for Technology-mediated Education, Beijing Normal University, Beijing, China
3 Faculty of Education, Beijing Normal University, Beijing, China

Proceedings of the NetSciLA22 workshop, March 22, 2022
EMAIL: mgao519@126.com (A.1); jingjing.zhang@bnu.edu.cn (A.2)
ORCID: 0000-0003-0129-703X (A.1); 0000-0002-0584-534X (A.2)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073

Abstract
As a form of online education, MOOCs offer a wide range of students an open and cost-free learning environment. Understanding how students allocate their attention among a growing number of well-designed educational resources is important to making sense of this type of online and flexible learning. This study selected the course ‘Introduction to Psychology’ offered on XuetangX as an example, and adopted an open-flow network approach to investigate how different learning achievement groups allocate their attention at a collective level. The results showed that the patterns of collective attention differed among learning achievement groups. Learners who received excellent and good scores were more likely to attend to the course syllabus and learning progress, whilst learners who failed or did not participate in the exam were more likely to wander around. Moreover, the cost of collective attention while attending to different resources follows the pre-designed course structure, which suggests that learners are largely steered by the pre-designed tree structure that packages knowledge into separate categories. As 21st-century skills require learners to make connections between pieces of knowledge, how to take full advantage of the open and flexible nature of the online course to facilitate more personalized and complex interaction between learners and content deserves serious attention.

Keywords
MOOCs, collective attention, open-flow network, network science, learning analytics

1. Introduction
Massive open online courses, known as MOOCs, are becoming increasingly popular among students looking for open educational opportunities online. Compared to traditional classrooms with a finite number of students, MOOCs offer a more open, flexible, and affordable way for a far larger number of individuals to learn [1]. MOOC learners are not required to pay any fees unless they wish to receive a course certificate. As a result, MOOCs appear to be cost-free, which is in keeping with the open and free philosophy that has long been promoted. It is therefore commonly argued that when students have such unfettered access to high-quality learning resources, they will make effective use of those resources. However, when students are presented with a vast quantity of resources, the true cost is their scarce and limited attention [2,3]. Although more and more learning resources are being made available in MOOCs, students can only concentrate their attention on a certain number of resources at a time. An individual's attention can be distributed and channeled toward a variety of resources. Although individual attention is dispersed, it aggregates into collective attention [4], which gives us an opportunity to understand how students allocate their attention to participate and interact in the massive information space at a collective level.
From the perspective of cognition [5], educational psychologists have spent years trying to understand how learners concentrate on or transfer their attention to instructional content. Unlike in traditional classroom learning, learners in an online environment with abundant learning resources need good self-regulatory ability to select learning resources, allocate learning time, and set a learning pace [6]. In this setting, learners’ attention has become a more macroscopic and external form of resource [7], and their academic attainment is related to the effective accumulation, circulation, and dissipation of collective attention flow [2].

The notion of the "attention economy" was introduced by Goldhaber, who defined attention as a scarce resource. He argued that attention can be traded and can flow between places in the online environment where we increasingly dwell, just as traditional resources are traded and flow to generate profit in economics [8]. In this view, human attention is creatively given the attribute of flow. To put it another way, attention moves continuously in and out of the various resources available on the web. Because there is only so much attention to go around, this movement carries a cost, which can be high.

In this study, we take four rounds of one course on XuetangX as an example and build upon earlier studies by using the network model of collective attention to investigate the participation and interaction patterns of learners with different achievements at the collective level. Clickstream data from four rounds of ‘Introduction to Psychology’ (Autumn 2017, Spring 2018, Autumn 2018, Spring 2019) were used to address the following two questions:
1. What do the allocation patterns of collective attention look like among four learning achievement groups?
2. What is the cost of the collective attention flow in and out of courseware resources among different groups?

2. Methods
2.1. Context
Introduction to Psychology on XuetangX was selected as a case to analyze the patterns of participation and interaction. As one of the top-quality open courses [9], it has been offered in several rounds since 2015. Following the traditional academic calendar, it opens to the public in two periods each year, the spring and autumn semesters. The course design adopts a traditional linear trajectory consisting of several units (see Figure 1.a). Each unit provides several video lectures, followed by an assignment for learners to self-evaluate. In total, 70 videos, assignments, and examinations from 13 units form the main learning resources of the course (Figure 1.b shows the number of learning resources in each unit). The course also provides other types of modules or resources, such as a course forum, syllabus, and progress page [10]. The overall course design remained almost unchanged across rounds.

While learners participate in the course, their behavioral data are automatically collected and stored in the database. In our previous studies, we found that learners tended to access this course on mobile devices. Unfortunately, behavioral data created via mobile devices were not stored in a complete form. As a result, we retained only the data from participants who accessed the course via the web, which sharply reduces the amount of data that can be analyzed in a single round of the course. Because the overall course design has remained the same, data from multiple rounds can be combined to increase the amount of available data. In this study, four rounds (Autumn 2017 to Spring 2019) of data were selected.
In these four most recent rounds of the course, a total of 21,182 learners accessed the course via the web and left 155,191 behavioral records. After checking the registration information and examination records, only 9,438 of these learners had complete records in the database. After data cleaning, these 9,438 learners, each with at least 1 second of log data (105,160 logs in total), were included in the analysis. According to the score they achieved in the final examination (the maximum is 100), we grouped these 9,438 learners into four cohorts: 79 participants whose scores were greater than or equal to 80 were labelled as excellent; 85 participants whose scores were at least 60 but below 80 were labelled as good; 1,139 participants were labelled as failed because their final scores were lower than 60; and the remaining 8,135 participants were labelled as absent because they did not take the exam. According to their reported demographic profiles, 936 learners were female, 1,232 were male, and 7,270 did not declare their gender. About 80% of those who reported their education had at least a bachelor's degree.

Figure 1: Description of the Course

2.2. Modelling an open-flow network of collective attention
Based on the learners’ clickstream data and our earlier research [2, 10, 11, 12], open-flow networks of collective attention were created to model the behaviors of the different cohorts. Learning sessions were used to establish a collective attention network because individual learners do not study all of the learning units within the same predetermined period. As a rule of thumb, online activities that occur more than 25.5 minutes apart are treated as separate sessions [13]. In this study, a new session was started whenever more than 30 minutes had passed since the learner's previous activity. Figure 2 shows the network model of collective attention and its flow between online and offline space.
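The 30-minute session rule described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `split_into_sessions`, the `(timestamp, resource)` event format, and the sample clickstream are all assumptions made for the example, as is the choice to keep a gap of exactly 30 minutes inside the same session.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # threshold stated in the text


def split_into_sessions(events, gap=SESSION_GAP):
    """Group one learner's (timestamp, resource) events into sessions.

    A new session starts when the time since the previous event exceeds
    `gap`. Treating a gap of exactly `gap` as the same session is an
    assumption of this sketch.
    """
    sessions = []
    previous_time = None
    for timestamp, resource in sorted(events):
        if previous_time is None or timestamp - previous_time > gap:
            sessions.append([])  # open a new session
        sessions[-1].append(resource)
        previous_time = timestamp
    return sessions


# Hypothetical clickstream for a single learner (not data from the study).
events = [
    (datetime(2018, 3, 1, 9, 0), "1.01"),
    (datetime(2018, 3, 1, 9, 10), "1.02"),
    (datetime(2018, 3, 1, 9, 45), "syllabus"),  # 35-minute gap: new session
    (datetime(2018, 3, 1, 10, 15), "1.03"),     # 30-minute gap: same session
]
sessions = split_into_sessions(events)
print(sessions)
```

Each resulting session is an ordered list of visited resources, which is the unit from which sequential visits between resources can then be counted.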
The nodes in this network model represent learning resources, and the links indicate the learners’ sequential visits to these resources. At the collective level, the large body of consecutive visits by learners reflects the flux of attention that flows into and out of the learning resources, and the network is formed by this flux. The offline space is represented by two added artificial nodes, ‘source’ and ‘sink’. A link from ‘source’ to another node marks the point at which a learner gets online; a link from a node to ‘sink’ marks the point at which a learner goes offline from that resource (i.e., ‘sink’ marks the end of a learning session). By adding these two artificial nodes, the network becomes an open and balanced model, which allows collective attention to flow in and out across online and offline spaces (more details are given in [2]). For each individual learning resource, the inflow of attention equals the outflow of attention. In this study, the total amount of attention flowing into each learning resource was taken into consideration.

Figure 2: The accumulation, circulation, and dissipation of attention flow (from [2], p. 287)

Flow distance, a metric proposed by Guo et al. [14], was used to measure the tendency of attention flow between learning resources. It is defined as the average first-arrival distance between nodes, estimated from Markov transitions of all orders. Specifically, the flow distance from node 1 to node 2 is the average of the step count k, weighted by the probability that attention leaving node 1 arrives at node 2 for the first time after exactly k steps, where k ranges from 1 to infinity. Because it estimates the average first-arrival distance between two nodes, flow distance describes the average number of steps needed for attention flow to circulate from one learning resource to another.
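To make the network model and the flow distance concrete, here is a minimal sketch under stated assumptions. It counts transitions within each session, adds the artificial ‘source’ and ‘sink’ nodes, and approximates flow distance by propagating attention step by step and averaging the first-arrival step. The step-by-step propagation is a simplification chosen for readability and is not necessarily the calculation used in [14]; the function names and the toy sessions are hypothetical.

```python
from collections import defaultdict


def build_flows(sessions):
    """Count transitions between consecutive resources in each session,
    with 'source' prepended and 'sink' appended to every session."""
    flows = defaultdict(int)
    for session in sessions:
        path = ["source", *session, "sink"]
        for src, dst in zip(path, path[1:]):
            flows[(src, dst)] += 1
    return dict(flows)


def node_balance(flows):
    """Return (inflow, outflow) totals per node."""
    inflow, outflow = defaultdict(int), defaultdict(int)
    for (src, dst), weight in flows.items():
        outflow[src] += weight
        inflow[dst] += weight
    return inflow, outflow


def flow_distance(flows, start, target, max_steps=1000, tol=1e-12):
    """Average first-arrival step count from `start` to `target`.

    Returns None when no attention ever reaches `target` from `start`.
    """
    _, outflow = node_balance(flows)
    probs = defaultdict(dict)  # row-normalised transition probabilities
    for (src, dst), weight in flows.items():
        probs[src][dst] = weight / outflow[src]

    mass = {start: 1.0}   # attention that has not yet reached the target
    arrived = 0.0         # total probability of ever arriving
    weighted_steps = 0.0  # sum over k of k * P(first arrival at step k)
    for k in range(1, max_steps + 1):
        next_mass = defaultdict(float)
        for node, amount in mass.items():
            for dst, p in probs.get(node, {}).items():
                next_mass[dst] += amount * p
        hit = next_mass.pop(target, 0.0)  # first arrivals at step k
        arrived += hit
        weighted_steps += k * hit
        mass = next_mass
        if not mass or sum(mass.values()) < tol:
            break
    return weighted_steps / arrived if arrived > 0 else None


# Hypothetical sessions (not data from the study).
sessions = [["1.01", "1.02"], ["1.01", "1.02"], ["1.01"], ["1.01", "1.02", "1.01"]]
flows = build_flows(sessions)
inflow, outflow = node_balance(flows)
print(flows)
print(flow_distance(flows, "source", "1.02"))
print(flow_distance(flows, "1.02", "source"))
```

In this toy network every resource is balanced (inflow equals outflow), and a pair of nodes with no attention flowing between them yields `None`, which corresponds to a null flow distance.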
A small flow distance from node A to node B indicates a greater tendency for attention to flow from node A to node B. The value of flow distance can also be used to measure the cost of collective attention. For example, if the flow distance between two content items is large, only a small part of learners’ attention flows according to the current order or sequence. This indicates a high cost of collective attention flow between these two items: learners usually do not allocate their attention to the next piece of content after completing the previous one. Conversely, if the flow distance is small, most of the learners’ attention follows the current order, which indicates that the pre-designed learning sequence has a strong effect on guiding the flow of collective attention.

3. Results
3.1. What do the allocation patterns of collective attention look like among four learning achievement groups?
Based on the learners' clickstream data, four open-flow networks of collective attention were built and are visualized in Figure 3. Different colors represent different types of learning resources, e.g., courseware, syllabus, discussion, and others. All collective attention networks have the same number of courseware resources (video lectures or quizzes; 70 in total), course bulletin resources (Info; 1), course introduction resources (About; 1), syllabus resources (1), and learning progress resources (1). The number of forum resources (discussion) and other resources (other) differs between the four groups. Node size represents the amount of collective attention flowing into and out of the resource.

Figure 3: Open-flow networks of collective attention of four groups

The metrics of the four collective attention networks are shown in Table 1.
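As a quick arithmetic check, the average degree and network density reported in Table 1 can be reproduced from the node and edge counts. The sketch below assumes that average degree is the number of edges per node and that density is computed for a directed network without self-loops; these are assumptions of this example, although they match the reported values to three decimals.

```python
# Node and edge counts copied from Table 1.
networks = {
    "Excellent": (202, 1627),
    "Good": (150, 1229),
    "Failed": (165, 1958),
    "Absent": (207, 2200),
}


def average_degree(nodes, edges):
    """Edges per node (assumed definition)."""
    return edges / nodes


def directed_density(nodes, edges):
    """Share of the n * (n - 1) possible directed edges that are present."""
    return edges / (nodes * (nodes - 1))


for group, (n, m) in networks.items():
    print(f"{group:<10} average degree = {average_degree(n, m):.3f}  "
          f"density = {directed_density(n, m):.3f}")
```

For instance, the failed group gives 1958 / 165 ≈ 11.867 and 1958 / (165 × 164) ≈ 0.072.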
There are more nodes in the networks of the groups of learners who performed excellently or did not participate in the final exam. Moreover, the network densities of these two groups were smaller than those of the other two groups. That is, learners in these two groups were more likely to attend to a wider range of learning resources, including resources outside the pre-designed course sequence, whereas learners in the other two groups tended to follow the pre-designed instruction. For the average degree, a one-way analysis of variance (ANOVA) revealed a significant difference between the four groups (F(3, 720) = 3.697, p = 0.012 < 0.05). A follow-up Tukey multiple comparison test (Tukey’s HSD) showed that the failed group’s average degree was significantly larger than the excellent group’s (mean difference = 7.624, p = 0.025 < 0.05, CI = [0.664, 14.585]), while there were no significant differences between the other groups. This indicates that failing learners’ attention was more likely to circulate among more learning resources than excellent learners'. However, the diameter of the excellent group's network was larger than that of the other groups, which may indicate that learners in the excellent group tend to form longer learning paths. It is interesting to find that the group of learners who did not pass the exam has not only the largest average degree and network density, but also the smallest average path length. This means that learners in this group frequently attend to all kinds of resources. One possible explanation is that they try to build connections between resources or to find the answer in any possible way.

Table 1: Attributes of four open-flow networks of collective attention
Group      Num. of nodes  Num. of edges  Average degree  Diameter  Average path length  Network density
Excellent  202            1627           8.054           14        3.320                0.040
Good       150            1229           8.193           6         2.675                0.055
Failed     165            1958           11.867          8         2.370                0.072
Absent     207            2200           10.628          6         2.473                0.052

The amount of collective attention flowing into each learning resource was also calculated, and differences were found between the groups of learners who performed excellently or well in the exam and the groups who did not pass or did not participate in the exam. For the groups of learners who performed excellently or well, the largest amount of attention flows into the course bulletin (Info), the introduction of the course (About), the syllabus resource (Syllabus), the learning progress resource (Progress), and several courseware resources (i.e., 1.08, 13.01, 1.01, 3.05). For the groups of learners who did not pass or did not participate in the exam, the top 10 learning resources with the largest attention flow are the course bulletin (Info), the introduction of the course (About), and several courseware resources (i.e., 1.01, 1.02, 1.03, 1.04, 1.05, 2.01). This could imply that the excellent and good groups are more concerned with the instructional plan and their own learning progress when directing their attention during the course.

3.2. What is the cost of the collective attention flow in and out of courseware resources among different groups?
In MOOCs, especially in xMOOCs, the main course contents (i.e., courseware provided by the course instructors, such as lectures, quizzes, and examinations) are usually organized as a tree structure, with chapters and units. This categorized structure is an important piece of guidance designed by instructional designers and subject experts. It aims to help learners follow a logical path through the material over time.
However, in an online setting that accommodates self-regulated learning in a more open and flexible manner, two questions need to be answered: whether learners still strictly follow the presented course structure, and whether they should follow the course design at all, given that learners are not a homogeneous group. To look into these issues, we aligned the learning resources with the unit structure and used flow distance to estimate how much collective attention it costs to follow the course structure. Based on the tree structure of learning resources, learners’ ideal learning path can be simplified as follows: (1) learners enter the online course learning space; (2) learners complete the content of all the units in the given order of the course; (3) learners leave the online course learning space. That is, the learning process can be described as the following order: source -> 1.01 -> 1.02 -> … -> 2.01 -> … -> 3.01 -> … -> 13.01 -> sink.

The flow distance between two items shows the tendency of collective attention to flow. If the flow distance between two content items is long, only a small part of the learners’ attention flows according to the current linear order. This indicates a high cost of collective attention flow between these two items: learners usually do not allocate their attention to the next piece of content after completing the previous one. Conversely, if the flow distance is short, most of the learners’ attention follows the linear order, which indicates that the pre-designed learning sequence has a strong effect on guiding the flow of collective attention.

Figure 4 shows the cost of collective attention for the different groups and their average level. The x-axis shows the sequential learning order, in which the first content/node is the first lecture of Unit 1 (i.e., 1.01) and the last content/node is sink, which indicates the offline space.
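The idea of reading cost along the designed path can be sketched in a few lines. The example below walks a shortened version of the sequence, looks up the flow distance of each consecutive pair, and labels each step as within a unit, between units, or touching the source/sink boundary. The helper names and every distance value are hypothetical illustrations, not results from the study.

```python
def unit_of(resource):
    """Unit prefix of a courseware id such as '2.01' -> '2'.
    Returns None for 'source' and 'sink'."""
    return resource.split(".")[0] if "." in resource else None


def sequence_costs(sequence, distance):
    """Return (src, dst, kind, cost) for each consecutive pair of the path."""
    rows = []
    for src, dst in zip(sequence, sequence[1:]):
        u_src, u_dst = unit_of(src), unit_of(dst)
        if u_src is None or u_dst is None:
            kind = "boundary"  # step involves source or sink
        elif u_src == u_dst:
            kind = "within-unit"
        else:
            kind = "between-unit"
        rows.append((src, dst, kind, distance[(src, dst)]))
    return rows


def mean_cost(rows, kind):
    """Average cost over the steps of one kind."""
    costs = [cost for _, _, k, cost in rows if k == kind]
    return sum(costs) / len(costs)


# Shortened designed path and made-up flow distances, for illustration only.
sequence = ["source", "1.01", "1.02", "2.01", "2.02", "sink"]
distance = {
    ("source", "1.01"): 1.5,
    ("1.01", "1.02"): 1.2,
    ("1.02", "2.01"): 3.4,
    ("2.01", "2.02"): 1.3,
    ("2.02", "sink"): 2.0,
}
rows = sequence_costs(sequence, distance)
for row in rows:
    print(row)
print("within-unit mean:", mean_cost(rows, "within-unit"))
print("between-unit mean:", mean_cost(rows, "between-unit"))
```

Separating the steps this way makes it easy to compare the average cost of moving within a unit with the cost of crossing a unit boundary.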
In the background, two colors (red and gray) alternate to mark the separate units. The y-axis shows the flow distance between two resources. For example, the value of the first node is the flow distance of collective attention from source to 1.01, and the value of the last node is the flow distance of collective attention from 13.01 to sink. Note that if no collective attention flows between two nodes, the flow distance is null. In that case, we assign the maximum flow distance to represent the flowing cost of collective attention between these two nodes, so that it can be quantified and visualized.

As shown in Figure 4, the four groups present a similar pattern of collective attention cost. First, the flow distances between the last resource of one unit and the first resource of the next unit (i.e., the node located at the intersection of two different background colors) are usually larger, while the flow distances between resources within a unit are smaller. This shows that the flowing cost of collective attention within a unit is small, whilst the flowing cost between different units is higher. It indicates that the linear tree structure of the contents shaped the learners' learning trajectories to some extent. However, content cohesion was weaker between units than within them, since following the pre-designed linear sequence across a unit boundary cost learners more collective attention. Second, the flow distances between two adjacent pieces of content (i.e., lectures or quizzes) within a unit often decrease first and then rise. This indicates that the contents within a unit were more closely related and in line with the learners’ cognitive development, so learners usually studied this content by following the pre-designed linear structure. Third, we also found that within some units, such as unit 9 and unit 12, the flow distances fluctuated more.
This may indicate that learners do not always follow the pre-designed linear structure when learning within a unit, and that a few instances of self-directed exploration occurred.

Figure 4: Flowing cost of collective attention of different groups when they followed the pre-designed linear learning sequence

4. Discussions and Conclusion
This study attempted to improve our understanding of online learning in an open and flexible learning environment from a network science perspective. To learn about participation and interaction patterns, we divided learners into four groups according to their performance in the final examination and created four open-flow networks of collective attention. To examine the differences in allocation and cost patterns of attention, we compared the network structural properties, the amount of collective attention, and the flow distance of different types of learning resources.

Among the top 10 learning resources with the highest traffic of collective attention, About, Info, and several courseware resources appear in all groups. However, the syllabus and progress resources appear in the top 10 only for the groups of learners who performed excellently or well in the exam. These two types of learning resources (i.e., syllabus and progress) are closely related to the instructional plan and to learners' own progress. This may indicate that learners’ final performance depends on their attention to the progress of the course and on timely learning. This finding is consistent with previous literature [15,16], which shows that it is important for learners to study regularly and in a timely manner in online courses. Although MOOCs provide students with an open and flexible online learning environment, their attention has become a scarce resource that determines what they will achieve in the face of abundant learning resources [2].
In this study, we also analyzed the cost of collective attention of different groups, especially the attention allocated to the courseware resources provided by the instructors. We measured the flowing tendency of collective attention between the courseware resources and found that it costs less when collective attention flows within a unit, but more when it flows between different units. Researchers have argued that the online learning space gives learners enough freedom to jump between different learning contents, so that they engage more actively in non-linear navigation [17]. However, the flowing cost of collective attention between and within units revealed that learners still prefer to learn by following a pre-designed course structure, which is in line with what previous studies argued [18,19]. Therefore, how to take full advantage of the open and flexible nature of online courses to facilitate more personalized and complex learning interactions, and so prepare students to make connections between pieces of knowledge, may be a key issue in online course design.

References
[1] B. D. Voss, “Massive Open Online Courses (MOOCs): A Primer for University and College Board Members,” 2013. AGB Association of Governing Boards of Universities and Colleges. Review. https://agb.org/wp-content/uploads/2019/01/report_2013_MOOCs.pdf
[2] J. Zhang et al., “Modeling collective attention in online and flexible learning environments,” Distance Educ., vol. 40, no. 2, pp. 278–301, 2019.
[3] H. Simon, “Designing organizations for an information-rich world,” Comput. Commun. Public Interes., vol. 72, pp. 38–72, 1971.
[4] F. Wu and B. A. Huberman, “Novelty and Collective Attention,” Proc. Natl. Acad. Sci., vol. 104, no. 45, pp. 17599–17601, 2007.
[5] D. Kahneman, Attention and effort. Englewood Cliffs: Prentice-Hall, Inc., 1973.
[6] C. Milligan and A. Littlejohn, “Supporting Professional Learning in a Massive Open Online Course,” Int. Rev. Res. Open Distrib. Learn., vol. 15, no. 5, pp. 197–213, 2014.
[7] M. B. Crawford, The world beyond your head: On becoming an individual in an age of distraction. Farrar, Straus and Giroux, 2015.
[8] M. H. Goldhaber, “The attention economy and the net,” First Monday, vol. 2, no. 4, 1997.
[9] Ministry of Education of the People’s Republic of China, “Ministry of Education launches 490 national selected online open courses,” 2018. http://en.moe.gov.cn/news/press_releases/201801/t20180119_325124.html
[10] J. Zhang, M. Gao, and J. Zhang, “The learning behaviours of dropouts in MOOCs: A collective attention network,” Comput. Educ., vol. 167, p. 104189, 2021.
[11] S. Zeng, J. Zhang, M. Gao, K. M. Xu, and J. Zhang, “Using learning analytics to understand collective attention in language MOOCs,” Comput. Assist. Lang. Learn., 2020. https://doi.org/10.1080/09588221.2020.1825094
[12] J. Zhang, Y. Huang, and M. Gao, “Video Features, Engagement, and Patterns of Collective Attention Allocation: An Open Flow Network Perspective,” J. Learn. Anal., vol. 9, no. 1, pp. 32–52, 2022.
[13] L. D. Catledge and J. E. Pitkow, “Characterizing Browsing Strategies in the World-Wide Web,” in The Third International WWW Conference, 1995, pp. 1–9.
[14] L. Guo, X. Lou, P. Shi, J. Wang, X. Huang, and J. Zhang, “Flow distances on open flow networks,” Phys. A Stat. Mech. its Appl., vol. 437, pp. 235–248, 2015.
[15] J. W. You, “Examining the Effect of Academic Procrastination on Achievement Using LMS Data in E-Learning,” Educ. Technol. Soc., vol. 18, no. 3, pp. 64–74, 2015.
[16] C. J. Asarta and J. R. Schmidt, “Access Patterns of Online Materials in a Blended Course,” Decis. Sci. J. Innov. Educ., vol. 11, no. 1, pp. 107–123, 2013.
[17] C. H. M. Lee, F. Sudweeks, and Y. W. Cheng, “A longitudinal study on the effect of hypermedia on learning dimensions, culture and teaching evaluations,” in Proceeding Cultural Attitudes Towards Technology and Communication, 2012, pp. 146–162.
[18] N. Ford and S. Y. Chen, “Matching/mismatching revisited: An empirical study of learning and teaching styles,” Br. J. Educ. Technol., vol. 32, no. 1, pp. 5–22, 2001.
[19] C. H. M. Lee, F. Sudweeks, Y. W. Cheng, and F. E. Tang, “The Role of Unit Evaluation, Learning and Culture Dimensions Related To Student Cognitive Style in Hypermedia Learning,” in Proceedings Cultural Attitudes Towards Communication and Technology, 2010, pp. 400–419.