Interactions of reading and assessment activities

Niels Seidel [0000-0003-1209-5038] and Dennis Menze [0000-0003-0002-868X]

FernUniversität in Hagen, Universitätsstraße 1, 58084 Hagen, Germany
niels.seidel@fernuni-hagen.de, dennis.menze@fernuni-hagen.de

Abstract. Reading and assessment are elementary activities for knowledge acquisition in online learning. Assessments in the form of quizzes can help learners to identify gaps in their knowledge and understanding, which they can then close by reading the corresponding text-based course material. Conversely, quizzes can be used to evaluate reading comprehension. The predominantly self-regulated interaction of reading and quiz activities in learning systems used in higher education has been little studied. In this paper, we examine this interaction using scroll and log data from an online undergraduate course (N=142). By analyzing processes and sequential patterns in user sessions, we identified six session clusters for characteristic reading and quiz patterns potentially relevant for adaptive learning support. These clusters showed that individual user sessions included either mainly reading or quizzes, but rarely both.

Keywords: reading analytics · learning analytics · assessment · sequential pattern mining

Copyright © 2022 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

The acquisition of knowledge through reading and quizzes is a fundamental activity in online learning. Quizzes as formative assessments can help learners to identify gaps in their knowledge and understanding, which they can then close by reading the corresponding text-based learning material, e.g. eBooks or digital textbooks. Conversely, quizzes can be used to evaluate reading comprehension. Unlike in video-based learning (e.g. MOOCs [6]) or many printed textbooks, quizzes in common online courses are separated from text-based knowledge acquisition. This separation is probably caused by the modular design of learning systems, which provide e.g. HTML pages and test environments. However, teachers invest a lot of effort in providing useful quizzes that correspond to the text-based learning materials. The interaction of reading and quiz activities under such conditions, in which students are free to combine activities and personalize learning paths over the semester, has hardly been investigated so far. We use a Learning Analytics approach to analyze these individual interactions between reading and quiz activities. With this approach, we aim to gain further insights into how to adaptively support learners performing these activities at a given point in time and over the course of a semester.

Regarding quizzes, events such as attempt start, submission of solutions, and retries, among others, are collected in the logs of an online learning environment, e.g. a Learning Management System. Similarly, reading activity can be inferred from scrolling events or page turns, since eye tracking is not feasible in practice. To identify behaviors from these log events, sequences of events within individual user sessions have to be considered. Frequently occurring event sequences are referred to as sequential patterns. With regard to sequential patterns, we want to answer the research question (RQ1): What sequential patterns can be identified in reading and quiz activities? Using established sequential pattern mining algorithms, we analyze reading and quiz activities within the individual user sessions. From these patterns, we identified clusters of frequent learning behaviors. From this analysis, we expect insights into situations that may require adaptive learning support.

The remainder of the article is structured as follows. In section 2, we refer to related works.
The applied methods are described in section 3, before we present the results and discussion in sections 4 and 5. The article ends with a summary and outlook in section 6.

2 Related Works

The precise analysis of reading behavior is not very widespread in the field of learning analytics. Using eBooks, [2] distinguished sequential and responsive reading behavior in a study (N=90) in order to determine engagement per page, content, and student. Reading engagement correlated with final grade while the proposed reading styles did not. The authors did not consider assessment and changes in the reading style over time. [14] formed three groups from a cohort of 160 graduate students considering their reading motivation and reading duration with regard to four course texts. The learning behavior was coded into six sequential patterns: intensive reading, multi-tasking reading, skim-reading, passing a course unit test, not completing a test, and being inactive. Changes between these behaviors revealed differences between the three groups. After performing a test, the two groups with low reading duration bypassed reading material in favor of another test. Test performances did not correspond to the reading behavior. However, reading was only roughly measured by page turns instead of paragraph views. [19] modeled scrolling activities to predict the revisitation of short text sections to define implicit bookmarks. In comparison to the aforementioned works, scroll data was precisely recorded and analyzed over time.

The relationship between reading and quiz activities in higher education has only been examined in a few studies so far. [4] analyzed these relations using an interactive textbook. About 700 computer science students made use of mandatory quizzes and optional reading tasks. The majority of students directly performed the quizzes without reading any text.
Those participants who did read did so shortly before the due date of their homework and could not spend much time on reading. In an experimental design (N=36/38), [18] compared reading comprehension using an eBook system with and without generated cloze items. Compared to the control group, the experimental group benefited from quiz items that effectively promoted their reading skills, reading engagement, and reading comprehension. The undergraduate participants could improve their reading comprehension through repeated tests. In a large study (N=13,362) of 412 courses in China, [3] used transfer state diagrams to describe changes between interactions with the course material and among peers and the instructor. Longer interaction sequences have not been considered. Although the changes have been compared between three periods of one month each, students have not been grouped based on behavioral similarities.

In summary, the interaction of reading and assessment has not been sufficiently explored. In particular, reading analytics could be improved by precise scrolling measures. Methods like process mining and sequential pattern mining in combination with clustering have not been used to consider relations of reading and assessment.

3 Methods

3.1 Data

Participants and design: The study was conducted in the compulsory course “Operating Systems and Computer Networks” of a distance learning B.Sc. Computer Science study program (CS course) in the winter semester 2020/2021. For the enrolled students, a supplementary course was set up in a Moodle learning environment. The use of the learning environment was voluntary, but conditional on a two-step consent to use the platform and to participate in the study. As an incentive for students’ participation, additional learning opportunities such as self-assessments, assignments, semester planning support, and interactive course texts were provided.
These differences in the learning offers are comparable to different didactic offers of tutors in face-to-face teaching. Students not participating in the study had no disadvantages regarding the examination since the course texts provided to all enrolled students form the basis of the examination. 180 of the 534 CS course participants agreed to take part in the study and to use the Moodle environment. By the end of the semester, the same number of active participants had been recorded, but only 142 of them used the quizzes or texts offered in the course. The participating students were between 19 and 65 years old (M=37.21, SD=9.03). 128 were male and 52 female.

Material: The Moodle course contained four units including course texts, a newsgroup forum, recordings of live sessions, 30 assignments corrected by a tutor, and questions for exam preparation. For this study, 42 self-assessment questions [5] and 23 multiple-choice questions were provided, both referred to as quizzes. Textual learning resources in the form of study letters are a core element of distance learning at the FernUniversität in Hagen. For the proposed field study, we converted the existing course units from Word/LaTeX format to HTML. The HTML could be used in the Moodle activity plugin called Longpage. This plugin was developed by the authors to support reading of comprehensive and interactive course materials of the size of 40 to 80 standard pages. In a reader-friendly design, basic instruments have been provided to facilitate navigation and orientation. For instance, a table of contents, individual bookmarks, cross-references, and a full-text search were included. Additional cross-references to semantically related courses were realized on the basis of text corpora [10] using a recommender system [12].
Course texts and quizzes are not embedded together on the same page but are kept separate, although there are some cross-links: texts contain several hyperlinks to corresponding quizzes, and the feedback to a submitted quiz can include hyperlinks to the corresponding text sections. The use of these links was not evaluated in this study. As shown in Fig. 1, individual reading progress was graphically emphasized at the margins of text sections.

Fig. 1. Presentation of a course unit using the so-called Longpage Moodle plugin.

Data collection and pre-processing: User interactions within the Moodle environment were captured in the database, in particular in the standard log store. To capture real usage data on users’ reading behavior, we used the Intersection Observer API that is available in modern web browsers. The Intersection Observer fires log events as soon as a text section becomes visible within the viewport of the user’s display. Text sections had a unique identifier and contained individual paragraphs, headlines, lists, images, and listings. The dataset consisted of 238,166 log entries from reading and quiz activities related to 1,359 individual user sessions. A user session is defined as a continuous sequence of consecutive log entries related to the quiz or reading activities of a user with a time difference between subsequent log entries of less than 45 minutes [7]. Fig. 2 shows an example of a reading session (being a sub-session of a user session), indicating the beginning and end of reading sessions in comparison to shorter activity breaks. Reading sessions were categorized based on tertiles of reading durations in minutes (intervals: (0, 1] < (1, 5] < (5, 60]). From these activities and sessions, the events shown in Tab. 1 were derived (10,079 in total).

Fig. 2. Example of a reading session derived from scroll events indicating reading begin and end as well as activity breaks

Table 1. Definition of reading and quiz events within a session

Event | Description | N
--- | --- | ---
reading start | first scroll event on page after at least 10 min | 987
reading short | < 1 min time window of scroll events on particular text page | 576
reading medium | 1-5 min time window of scroll events on particular text page | 266
reading long | > 5 min time window of scroll events on particular text page | 355
reading pause | < 5 min break without scroll events on particular text page | 210
reading continue | first scroll event after break on particular text page | 210
reading end | > 10 min break without scroll events on particular text page | 987
quiz start | first time of opening particular quiz | 2,785
quiz repeat same | repeated quiz attempt of same quiz after success or fail | 567
quiz repeat other | repeated quiz attempt of a quiz unrelated to the last one | 313
quiz success | submitted a solution that is more than 80 % correct (cf. [8]) | 1,473
quiz fail | submitted a solution that is less than or equal to 80 % correct | 1,350

3.2 Mining processes and sequences

To gain further insights into common patterns of user sessions, different methods may be used: trace or profile clustering [13, 17], where a trace or profile could be the vector of frequencies of events per session, or the so-called hypothesis-driven approach from [1] of finding study patterns by labeling activity sub-sequences. The latter was used in this study. Thus, from each user session, nominal features were manually generated according to the following sequential properties: whether the sequence of events per session starts with, ends with, and/or mainly consists of reading or quiz activities; and to which tertile the length of the sequence of events per session belongs (intervals: [1, 3] < (3, 7] < (7, 84]). As there is no prior knowledge of how to best cluster the dataset, the unsupervised method k-means was used for clustering user sessions by their properties. The number of clusters was determined by the distortion score and silhouette score [11]. The distortion score is defined as the mean of the sum of squared distances of data points to the center of the cluster. Each cluster, containing a subset of user sessions, was further classified as to whether it contained mainly quiz activities, mainly reading activities, or both.

Next, on each cluster, HeuristicsMiner [16] was used for visualizing the sessions as state transition charts. This method identifies state transitions between pairs of states A, B above a defined dependency threshold calculated as follows:

$$\mathrm{dependency}(A, B) = \frac{|A \to B| - |B \to A|}{|A \to B| + |B \to A| + 1} \tag{1}$$

Here, |A → B| denotes the number of times state A is directly followed by state B. Since only pairwise activity sequences are considered when looking at state transitions, we also analyzed longer activity sequences using the PrefixSpan algorithm [9] for mining sequential patterns. To determine the most frequent sequences with a minimal length of three activities, the Support measure was employed. The Support is defined as the ratio of the number of sequences containing a certain sequence to the number of all sequences. It ranges from 0 to 1, where 0 means that the sequence did not occur at all and 1 that the sequence occurred in all sequences.

4 Results

After presenting very brief results from a descriptive analysis, the results of the process and sequence mining of user sessions will be presented in the second part of the section. The anonymized data and analysis scripts used are publicly available at https://anonymous.4open.science/r/A0F145 (last accessed 2022/07/01).

4.1 Descriptive Analysis

For a descriptive analysis of the quiz activities, we refer to one of our previous papers [5]. Fig. 3 shows how much of the estimated reading time students spent on the course unit (CU) texts during the whole semester.
For CU1, for instance, 11% of students had reading sessions equal to or longer than 100% of the estimated CU reading time.¹ Referring to the empirical results of Trauzettel-Klosinski et al. [15], reading time for German-language text was estimated by the average number of characters read per minute minus the variance of reading speed. We assumed a reading speed lower than average because of the comparatively high text complexity of academic texts.

¹ Note that the percentages of students in Fig. 3 do not add up to 100 because reading times of less than 5% are not represented.

Fig. 3. Percentage of estimated reading time spent on page.

4.2 Mining processes and sequences

For k ranging from 2 to 20, we used a silhouette analysis [11] to find the best k representative clusters for the supplied dataset, using measurements of the average distance inside the clusters and the average distance between the clusters. For k = 6, the best result was obtained (silhouette score = 0.89, distortion score = 111.63). As shown in Tab. 2, the size of the clusters, measured by the number of user sessions, ranges from 94 to 394. The sessions covered by each cluster were performed by a substantial portion of participants (23.9 % – 66.9 %). For visualizing the transitions between the actions in a cluster, significant transitions are drawn in a state transition diagram using the HeuristicsMiner [16] with a dependency threshold of 0.9. The action sequences in every cluster were further classified as to whether they contained mainly quiz activities (SC1, SC3, and SC4), mainly reading activities (SC2 and SC6), or both (SC5). From the session clusters, state transition charts have been derived to support the interpretation of the data. Figs. 4–9 show the state transition charts. The states correspond to the events listed in Tab. 1, while transitions are indicated by outgoing arrows pointing to the subsequent activity.
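The dependency measure of Eq. 1, which underlies the threshold of 0.9, can be illustrated with a minimal sketch. This is not the HeuristicsMiner implementation used in the study: the function names and the three sample sessions are hypothetical, and only the event names follow Tab. 1.

```python
from collections import Counter


def direct_follows(sessions):
    """Count how often event a is directly followed by event b."""
    counts = Counter()
    for events in sessions:
        for a, b in zip(events, events[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Dependency measure of Eq. 1 for the ordered pair (a, b)."""
    ab = counts[(a, b)]  # |A -> B|; a Counter returns 0 for unseen pairs
    ba = counts[(b, a)]  # |B -> A|
    return (ab - ba) / (ab + ba + 1)


# Hypothetical sessions, for illustration only (not data from the study).
sample_sessions = [
    ["quiz start", "quiz success", "quiz start", "quiz fail"],
    ["quiz start", "quiz success"],
    ["quiz start", "quiz success"],
]
counts = direct_follows(sample_sessions)
print(dependency(counts, "quiz start", "quiz success"))  # 0.4
```

Because of the +1 in the denominator, a transition that is only ever observed in one direction needs at least nine occurrences to reach a value of 0.9 (9/10), so rare transitions are filtered out even if they are never reversed.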
The labels of the transitions show the probability of a transition to the respective state, ranging between 0 and 1. A value of 0 would indicate no transition to another state, while a value of 1 would indicate that a transition exists to a single connected state. Transitions with probabilities below the threshold of 0.01 are not displayed to improve readability.

Table 2. Session clusters (SC)

Session clusters | SC1 | SC2 | SC3 | SC4 | SC5 | SC6 | Total
--- | --- | --- | --- | --- | --- | --- | ---
Sessions (%) | 234 (17.2) | 207 (15.2) | 394 (29.0) | 261 (19.2) | 169 (12.4) | 94 (6.9) | 1359 (100)
Users (%) | 82 (57.8) | 71 (50.0) | 95 (66.9) | 71 (50.0) | 34 (23.9) | 48 (33.8) | 142 (100)

SC1 only contains quiz activities (Fig. 4). 22 % of those who call up a quiz continue to browse other quizzes until they find one that suits them. If they fail a quiz, they continue with another one. The majority of those who accomplish a quiz successfully go for another one. Only a small fraction of 4 % (72 cases) look for mastery and repeat the same quiz in order to achieve the full score. Overall, the participants selectively go through the offered quizzes and perform some of them.

Fig. 4. Session cluster 1
Fig. 5. Session cluster 2
Fig. 6. Session cluster 3
Fig. 7. Session cluster 4
Fig. 8. Session cluster 5
Fig. 9. Session cluster 6

Reading-only activities are part of SC2 (see an example session in Fig. 2, and Fig. 5). One-fifth each start reading for a long and for a medium duration; 60 % read only for a short time. No further findings could be derived from the three mined sequences. Since the learners do not interrupt their reading activities by pauses, another course unit, or quizzes, the activities shown in the sessions in SC2 indicate a directed and deliberate behavior shown by half of the participants.

SC3 is the largest cluster in size and the second quiz-only cluster (Fig. 6). In almost half of the sessions, the session ends after a quiz attempt is started, without a solution being submitted.
93 % of the sessions with a failed attempt will not be continued. Only 7 % of those who fail go on to another quiz. 15 % of the successful attempts lead to the start of another attempt, but the majority terminate the session. For this session cluster, only sequences of length two were identified. SC3 is characterized by canceling quizzes after reading the quiz description or making one attempt. About two-thirds of the participants showed this behavior.

SC4, the third quiz-only cluster (Fig. 7), represents almost one-fifth of the sessions. Learners who start an attempt succeed and fail in roughly equal proportions. 42 % of those who fail try another quiz, while 45 % retry the same quiz to improve. After repeating the same quiz, 59 % are successful, but 35 % fail again. Only 5 % switch to another quiz. 4 % of those who successfully complete a quiz go for mastery to become even better. Half of the participants performed sessions as described in SC4. In these sessions, they intensively try to improve their quiz performance and go on to tackle multiple quizzes.

SC5 (Fig. 8) involves both reading and quiz activities, but almost all sessions (95 %) start with reading for a short (43 %), medium (22 %), or long (35 %) time. Quiz attempts are made after reading end (15 %), during reading long (27 %), or during a reading pause (12 %). 38 % of the quiz attempts are successful and 39 % of those are followed by another quiz attempt. Learners in this SC do not submit solutions that let them fail; the activity quiz fail is not present at all. Instead of risking a failed quiz, they end reading or go for another quiz. 16 % continue to read the previously used text, while 34 % go for another text after the successful completion of a quiz. SC5 is characterized by a strong interaction of reading and quiz activities.

SC6 refers to reading-only activities (Fig. 9) and describes transitions between short, medium, and long reading phases after a reading break. Breaks from reading follow 29 % of the long and 14 % of the short reading phases. After a reading break, one-third each continued reading for a short, medium, and long time. The user sessions assembled in this session cluster are characterized by reading multiple course units within a session. No further findings could be derived from the eight mined sequences.

5 Discussion

For analysis, we applied methods like process mining, sequence mining, and clustering. In the particular context of online learning in distance education, the application of these methods and their results require a critical reflection. First of all, this analysis frames reading and quizzes as the main course activities but ignores other activities like newsgroup discussions, assignments, live sessions, and self-regulated learning support.

Session identification is an important phase in web usage mining, which can be done using time-oriented heuristics with session timeout intervals, as in our work, or navigation-oriented heuristics that consider the connection between content [7]. The latter would regard course units, chapters, etc. as logical boundaries within the course, which takes into account that sometimes a student finishes reading on one day and then resumes learning later with quiz activities. This would provide a better way to examine student behavior for somewhat independent course units. However, we preferred time-oriented heuristics because some quizzes refer to multiple course texts, and most importantly, we view a learning session as a continuous stream of consciousness within which short-term memory is filled with temporary content to make associations between concepts for the purpose of learning. This brings the temporal aspect to the fore.
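The time-oriented heuristic can be illustrated with a minimal sketch, assuming that the timestamps of one user's reading and quiz log entries are available as Python datetime objects. The function name split_into_sessions and the sample timestamps are hypothetical and are not taken from the published analysis scripts; only the 45-minute timeout comes from Section 3.1.

```python
from datetime import datetime, timedelta

# Timeout from Section 3.1: entries less than 45 minutes apart share a session.
SESSION_TIMEOUT = timedelta(minutes=45)


def split_into_sessions(timestamps, timeout=SESSION_TIMEOUT):
    """Group one user's log timestamps into sessions (time-oriented heuristic)."""
    sessions = []
    for ts in sorted(timestamps):
        # Open a new session if none exists yet or the gap reaches the timeout.
        if not sessions or ts - sessions[-1][-1] >= timeout:
            sessions.append([ts])
        else:
            sessions[-1].append(ts)
    return sessions


# Hypothetical log entries of a single user, for illustration only.
log = [
    datetime(2021, 1, 11, 9, 0),
    datetime(2021, 1, 11, 9, 20),   # 20 min gap: same session
    datetime(2021, 1, 11, 10, 30),  # 70 min gap: new session
    datetime(2021, 1, 11, 10, 40),  # 10 min gap: same session
]
sessions = split_into_sessions(log)
print(len(sessions))  # 2
```

A navigation-oriented variant would additionally split whenever the course unit of an entry changes, which this sketch deliberately does not do.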
Process mining better characterizes the whole session in abstract terms, while sequence mining identifies frequent shorter sequences within the sessions. By applying both methods, we could combine their advantages. We considered the start, end, and proportion of reading and quiz activities as nominal features for clustering user sessions. Other approaches employed the frequency of activities (trace profile) for clustering, or process mining on each user session and then clustering the dependency matrix. However, the selected features include structural characteristics that indicate the intention for initiating and terminating a session. Furthermore, we were able to consider a comparatively large set of unique activities. Although the middle part of a session was only roughly mapped as a clustering feature, we employed sequential pattern mining to describe frequent patterns.

User sessions representing the most frequent reading and quiz behaviors (ignoring other types of activities, which are also used much less by students) could be identified through clustering as a means of answering RQ1. Concerning quizzes, we found selective behavior aimed at finding preferred quizzes (SC1), the canceling of inappropriate quizzes (SC3), and intensive quiz sessions (SC4). Reading behaviors manifested in sessions with either a single course unit (SC2) or multiple course units (SC6). Also, interactions between reading and quiz activities could be observed (SC5). These behaviors can be associated with learning strategies indicating learning progress and difficulties and thus used for adaptive learning support. A prevalence of sessions that are less conducive to learning (e.g., SC1 or SC3) or a low variance in types of sessions (e.g. SC1–SC4, SC6), and thus solely reading or assessment activities, could be countered by adaptive suggestions for appropriate learning strategies: students who often fall into clusters that are not beneficial for learning (e.g. skipping many quizzes, too many or too few retries, etc.) could be supported by prompting reflection on their behavior.

The analysis underlines the importance of assessments in terms of quizzes. However, the knowledge required to successfully perform the quizzes was provided in the course texts. In the present analysis, we did not consider these semantic interrelations between quizzes and course texts.

After quizzes, the provided reading facilities proved to be important in about one-third of the sessions and for half of the participants. Although the participants tended toward short reading phases, considerable medium and long reading spans could be observed. However, in the majority of user sessions, learners did not read course texts in Moodle. It is likely that they did not refrain from reading altogether, since they successfully completed some of the quizzes. Despite the added value provided by the Longpage plugin, they may have preferred the print or PDF versions of the course texts. Note that the printed texts contained QR codes pointing directly to quizzes in the Moodle course.

A strong interplay of reading and quizzes could not be confirmed by our analysis. Although some students used quizzes and course texts within the same user sessions, this mixture does not appear as a predominant behavioral pattern across the user sessions.

In this study, quiz and text difficulty were not measured directly. To examine, in a preliminary analysis, the extent to which certain behaviors are related to a particular course unit (perhaps some units are difficult so that students need to try the quiz again, and others are easy so that even if students fail, they think they do not need to try the quiz again), we calculated the correlation between the proportion of session clusters per course unit and the average correctness of answers.
As a result, course units 2 and 3 were significantly more difficult than course units 1 and 4 by this measure: units 2 and 3 had on average about 5% fewer correct answers than units 1 and 4. The average correctness was strongly negatively correlated with the relative frequency of session cluster 1 and strongly positively correlated with that of session cluster 5. Thus, more difficult content led to many different quizzes being taken per session, but less mixing of reading and quizzing in one session.

In a more detailed analysis, we could distinguish the paths from quiz to text and vice versa. For instance, the course texts contain several hyperlinks to corresponding quizzes, while the feedback to a submitted quiz includes hyperlinks to the corresponding text sections. From the perspective of self-regulated learning, these cross-links are important and could be adaptively emphasized during corresponding reading and assessment activities.

6 Conclusion and Outlook

In this study, we aimed at analyzing interactions of reading and assessment that are potentially relevant to adaptively support learners. We could identify six session clusters of reading and quiz activities in our data (RQ1), which we further classified as comprising mainly quiz, mainly reading, or both reading and quiz activities. They showed subtle differences in their respective quiz and reading sequences: how many quizzes were tackled per session, whether quizzes were repeated to deepen knowledge after a success or a fail, and how much reading occurred per session, with or without breaks. A strong relationship between reading and quiz activities per session could not be found.

In a further study, we should try to replicate the found patterns (session clusters) in the following semester of the same course with a different cohort. The prediction of session clusters from the first log events could be elaborated using decision trees to investigate possible interventions more concretely.
The dependence between behaviors exhibited in session clusters and difficulty of the material raised in the discussion needs to be studied in more depth. Correlations of found patterns with other factors like grades, assignment results, and course re-enrollment should be studied. Also, a correlation between the observed frequency of certain clusters over the course of a semester per student and their dropout rate would be important for early intervention by teachers. Furthermore, existing cross-links between quizzes and texts could be investigated. Since the winter term 2021/22, the Longpage plugin has supported text highlighting and co-reading through threaded anchored discussions and visualization of group-related reading progress. This will enable further analysis of reading behavior in terms of the expected added value for learning.

With the presented analytical approach, we hypothesized types of user sessions with similar reading and quiz behavior that could be used as indicators for adaptive personalized learning support. Adaptation could scaffold flexible participation and self-regulation through the close interaction of reading and quizzes.

Acknowledgements

This research was supported by the Research Cluster “Digitalization, Diversity and Lifelong Learning – Consequences for Higher Education” (D²L²) of the FernUniversität in Hagen, Germany.

References

[1] Boroujeni, M.S., Dillenbourg, P.: Discovery and temporal analysis of latent study patterns in MOOC interaction sequences. In: Proceedings of the 8th International Conference on Learning Analytics and Knowledge. pp. 206–215. ACM, New York, NY, USA (3 2018). https://doi.org/10.1145/3170358.3170388
[2] Boticki, I., Akcapinar, G., Ogata, H.: E-book user modelling through learning analytics: the case of learner engagement and reading styles. Interactive Learning Environments 27, 754–765 (8 2019). https://doi.org/10.1080/10494820.2019.1610459
[3] Cheng, H.N.H., Liu, Z., Sun, J., Liu, S., Yang, Z.: Unfolding online learning behavioral patterns and their temporal changes of college students in SPOCs. Interactive Learning Environments 25, 176–188 (4 2017). https://doi.org/10.1080/10494820.2016.1276082
[4] Fouh, E., Breakiron, D.A., Hamouda, S., Farghally, M.F., Shaffer, C.A.: Exploring students learning behavior with an interactive etextbook in computer science courses. Computers in Human Behavior 41, 478–485 (2014)
[5] Haake, J.M., Seidel, N., Burchart, M., Karolyi, H., Kasakowskij, R.: Accuracy of self-assessments in higher education. pp. 97–108 (2021)
[6] Kovacs, G.: Effects of in-video quizzes on MOOC lecture viewing. In: L@S 2016 - Proceedings of the 3rd 2016 ACM Conference on Learning at Scale. pp. 31–40 (2016). https://doi.org/10.1145/2876034.2876041
[7] Kovanović, V., Gašević, D., Dawson, S., Joksimović, S., Baker, R.S., Hatala, M.: Penetrating the black box of time-on-task estimation. In: ACM International Conference Proceeding Series. vol. 16-20-Marc, pp. 184–193. Association for Computing Machinery (mar 2015). https://doi.org/10.1145/2723576.2723623
[8] Maldonado-Mahauad, J., Pérez-Sanagustín, M., Kizilcec, R.F., Morales, N., Munoz-Gama, J.: Mining theory-based patterns from Big data: Identifying self-regulated learning strategies in Massive Open Online Courses. Computers in Human Behavior 80, 179–196 (2018). https://doi.org/10.1016/j.chb.2017.11.011
[9] Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U., Hsu, M.C.: PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. Proceedings - International Conference on Data Engineering pp. 215–224 (2001). https://doi.org/10.1109/icde.2001.914830
[10] Rieger, M.C., Seidel, N.: Semantic Textual Similarity von textuellen Lernmaterialien. In: Pinkwart, N., Konert, J. (eds.) Die 17. Fachtagung Bildungstechnologien, Lecture Notes in Informatics (LNI). pp. 33–44. Gesellschaft für Informatik, Bonn (2019)
[11] Rousseeuw, P.J.: Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics 20, 53–65 (11 1987). https://doi.org/10.1016/0377-0427(87)90125-7
[12] Seidel, N., Rieger, M.C., Walle, A.: Semantic Textual Similarity of Course Materials at a Distance-Learning University. In: Price, T.W., Brusilovsky, P., Hsiao, S.I.H., Koedinger, K., Shi, Y. (eds.) Proceedings of 4th Educational Data Mining in Computer Science Education (CSEDM) Workshop co-located with the 13th Educational Data Mining Conference (EDM 2020), Virtual Event, July 10, 2020. CEUR-WS.org (2020), http://ceur-ws.org/Vol-2734/paper6.pdf
[13] Song, M., Günther, C.W., Van Der Aalst, W.M.: Trace clustering in process mining. In: Lecture Notes in Business Information Processing. vol. 17 LNBIP, pp. 109–120 (2009). https://doi.org/10.1007/978-3-642-00328-8_11
[14] Sun, J.C.Y., Lin, C.T., Chou, C.: Applying learning analytics to explore the effects of motivation on online students’ reading behavioral patterns. International Review of Research in Open and Distributed Learning 19, 209–227 (4 2018)
[15] Trauzettel-Klosinski, S., Dietz, K., the IReST Study Group: Standardized Assessment of Reading Performance: The New International Reading Speed Texts IReST. Investigative Ophthalmology & Visual Science 53(9), 5452–5461 (08 2012). https://doi.org/10.1167/iovs.11-8284
[16] Weijters, A., van der Aalst, W.M.P., de Medeiros, A.K.A.: Process Mining with the HeuristicsMiner Algorithm (2006)
[17] Xu, J., Liu, J.: A Profile Clustering Based Event Logs Repairing Approach for Process Mining. IEEE Access 7, 17872–17881 (2019). https://doi.org/10.1109/ACCESS.2019.2894905
[18] Yang, A.C.M., Chen, I.Y.L., Flanagan, B., Ogata, H.: Automatic generation of cloze items for repeated testing to improve reading comprehension. Educational Technology & Society 24(3), 147–158 (2021)
[19] Yu, C., Balakrishnan, R., Hinckley, K., Moscovich, T., Shi, Y.: Implicit bookmarking: Improving support for revisitation in within-document reading tasks. International Journal of Human-Computer Studies 71, 303–320 (2013). https://doi.org/10.1016/j.ijhcs.2012.10.012