Visualizing Lecture Capture Usage: A Learning Analytics Case Study

Christopher Brooks
Department of Computer Science
University of Saskatchewan
Saskatoon, SK, Canada
cab938@mail.usask.ca

Craig Thompson
Department of Computer Science
University of Saskatchewan
Saskatoon, SK, Canada
craig.thompson@usask.ca

Jim Greer
Department of Computer Science
University of Saskatchewan
Saskatoon, SK, Canada
jim.greer@usask.ca

Abstract
This paper outlines our initial investigations of applying information visualization techniques to lecture capture video systems. Our principal goal is to better understand how students use these systems, and what visualizations make for useful learning analytics. We apply three different methods to viewership data aimed at understanding student rewatching behaviour, temporal patterns for a single course, and how usage can be compared between groups of students.

Author Keywords
Lecture Capture; Information Visualization; Usage Data; E-Learning

ACM Classification Keywords
K.3.1 [Computers and Education]: Computer Uses in Education; H.5.1 [Multimedia Information Systems]: Video (e.g., tape, disk, DVI).

Copyright © 2013 for the individual papers by the papers’ authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors. WAVe 2013 workshop at LAK13, April 8, 2013, Leuven, Belgium.

Introduction
The activities described in this paper fall into a larger program of research focused on determining the efficacy of lecture capture on student learning. We are interested in using both statistical techniques (e.g. machine learning) and visual techniques (e.g. information visualization) to understand the patterns of interaction learners have with lecture video. This paper introduces three activities we have undertaken to visualize learner data.

The data for this paper comes from Recollect, a lecture capture environment we developed for research use in 2010 and 2011. Described more fully in [2], the data we use is generated by the Recollect heartbeat, a client-side event that occurs every 30 seconds and records (among other things) which video a user is playing, the location of the playhead within the video, and whether the video is paused or not. This structure has been ported to the freely available Opencast Matterhorn project (http://www.opencastproject.org), which is the focus of new research on lecture capture analytics at our institution.

Case 1: Visualizing Rewatching Behaviour
Motivated by the correlative link between re-reading discussion forum messages and academic performance in undergraduate students [1], we were interested in better understanding whether learners re-watch lecture capture content. Summary usage statistics for our lecture capture facilities show that some students play back significantly more video than others (see sidebar). Our question is whether there are any meaningful patterns in how learners view lecture videos.

Case Study: Chemistry 200 (sidebar)
One course we examine in this work is a second year undergraduate Chemistry course. This course was taught using traditional lectures to 546 students in multiple sections. One instructor elected to have his lectures recorded and provided to all of the students as a study aid. Examinations and assignments between sections were the same. Only 333 students watched video content for at least five minutes, a participation rate of 61%. In total, learners watched over 77 days’ worth of lecture video, a remarkable number given that only 38 lectures of 50 minutes each were recorded.

To visualize lecture watching activity, we plotted all of the sessions for an individual user/video pair using a scatter plot, with the x axis representing time within a video playback session in minutes, and the y axis representing the time within the lecture that was being watched. A 45° diagonal line represents viewing the video without pausing, while horizontal lines indicate the video was paused. By colour coding sessions and overlaying their plots on top of one another, we can identify how many times a student has watched a particular portion of video by summing the number of diagonal lines over a region of the y dimension. If learners spend a session watching some video and “pick up” from where they left off, there will be no gap in the y dimension. If they re-watch content they have previously seen there will be some overlap of the diagonal lines, and if they skip ahead there will be space in between each diagonal in the y dimension.

Figure 1 shows graphs of student interaction data using three different students and three different videos. The first image, Figure 1a, shows many long diagonal lines, indicating that the learner tends to navigate to the position of interest and watch for an extended period of time. These lines overlap heavily along the y dimension, indicating the student has rewatched significant portions of the video, some up to four times. There are also long horizontal lines indicating the student has paused the video for extended periods of time. Due to the strong diagonals, we have labelled this pattern of activity the regular rewatcher.

Figure 1b shows a different kind of student, who has viewing sessions with a low slope, indicating they watch the video at slower than real time. Our system didn’t allow for variable speed playback, and deeper investigation shows that the learner both pauses the video frequently and slowly seeks through the video, perhaps spending time on particular segments to transcribe content or apply it to a problem they may be working on. Despite this, the learner has both rewatched one section of the video completely (given by the two overlapping lines), and ended up playing back all of the video content. Due to the perceived activity of the learner, we refer to this student as an engaged rewatcher.

Finally, Figure 1c shows a learner who makes strong use of the pause feature of the player. While it was surprising to us, it is not uncommon to find learners who open multiple videos at once and then pause video for hours, even days, and come back to continue playing the video. This learner exemplifies this behaviour, with many sessions containing long horizontal lines, some turning into diagonals after extended periods. While our visualizations are truncated to 120 minutes in order to maintain a 1:1 ratio between axes, there are many learners who tend to follow this pauser rewatcher pattern.

Figure 1: Rewatching graphs for three different learners. Panels: (a) Student: Orson, Video: 2758138; (b) Student: Joan, Video: 6037716; (c) Student: William, Video: 1782956. Axes: Playback Duration (x), Video Time (y).

Case 2: Temporal Patterns of Viewership
In addition to understanding how learners watch individual videos, we are interested in understanding how learners watch videos over the span of the course. Previous work has been done by others on visualizing intravideo navigation [4], but we’re interested in intervideo usage patterns, such as periods of high activity prior to examinations and assignments. We are motivated in part by our previous work, which has demonstrated that there is a statistically significant positive correlation between academic achievement and habitual weekly viewing of lecture videos [3].

To visualize temporal intervideo usage, we created three-dimensional heatmaps for each course. These maps plot the time a video was made available to students (y axis), the time at which students watched that video (x axis), and the total time that video was watched (colour axis). Data was binned to one day intervals and, while we did not hold the axes equivalent as we did in the previous example, a strong diagonal line represents learners watching lectures as soon as they become available, and the more filled in the lower right triangle of usage is, the more lectures were revisited.

While a number of different patterns were observed, here we present three distinct patterns. The first kind of course, shown in Figures 2a and 2c, demonstrates the most typical pattern of interaction. Here, learners rewatch lectures from earlier in the term as evaluations (midterm and final exams) approach. While there is some watching of early lecture content throughout the term, we notice large patches of blue (cool, or minimal) usage of content that was recorded before the midterm evaluation once the midterm has been delivered. While there is some viewership of this content right before the final exam, this activity is minimal.

The second kind of course, shown in Figure 2b, has particularly strong viewership of only the most recent video right before the final examination. This suggests that a very important lecture was given at the end of the course (perhaps a comprehensive review of topics), or that topics from early on in the course will not be tested. As this course was an introductory programming course in Computer Science, it is quite likely that the early portion of the course focused on fundamental skills, while the latter half of the course required execution of these skills in programming assignments (leading to a reduction in watching of video content).

The last kind of course, an introductory Calculus class for non-majors, involved regular forms of evaluation spaced roughly every two weeks. Here, viewership patterns follow a diagonal band, where early content is rarely watched later on in the course. This suggests to us that the content either comes in distinct “chunks” which are unrelated, or that early content is fundamental to the later content and doesn’t need to be revisited as the course progresses. This pattern is most interesting to us given our previous studies indicating regular viewership of captured lectures is correlated with academic achievement. In this course we don’t observe the same amount of “cramming” when regular evaluation is applied, and we are interested in determining if it is the domain that causes this change, or the pedagogical approach.

Throughout most of these visualizations, we see a trend of high viewership (high temperature colour) early on in the course, at evaluation points, and at the end of the course.

Figure 2: Heatmaps for four cohorts demonstrating the relationship between publication date of video and viewing date by students. Panels (a) to (d); panel titles: Introductory Computer Science, Introductory Biology, Second Year Geology, Introductory Calculus. Axes: viewing date (x), Published date (y); colour scale: Video time watched. Midterm, exam and final dates are marked on the panels.

Case 3: Comparing Usage Between Groups of Students
Our final visualization aims to shed light on how students in different identifiable groups use lecture capture facilities differently. We are particularly interested in comparing high achieving learners (those who achieved an exceptional pass of the course with a grade of 87.5% or higher) to low achieving learners (those who achieved a marginal pass of the course with a grade of 50%–62.5%).

We plotted a histogram of viewership activity for each group. We concatenated the lengths of each lecture together (for data cleaning, we limited the length of a lecture to 45 minutes) to form a continuous x axis of 28.5 hours of video. Each 15 minutes along the axis denotes a single histogram bin; viewership for that bin is equal to the number of heartbeats we would expect if every student in that group watched the whole 15 minutes (e.g., 30 heartbeats per student). This captures both initial watching and rewatching behaviour, but not behaviour where the video is paused. The two histograms were then plotted simultaneously with alpha channelling to see commonalities. In Figure 3 the red plots indicate viewership by high achieving students, the blue plots indicate viewership by low achieving students, and the shared viewership patterns are shown in purple.

Conclusions
Through these visualizations, we have been able to gain insight into how learners used lecture capture, how this aligns with activities over an academic term, and how student populations differ in their use of lecture capture systems. Applying visual analytics to “big data” problems is not without caveats: the effects of parameters for charts (including time offsets, resolution of heartbeat data, and aggregation into bins for heatmaps and histograms) and determining the right data to process make discussion and prototyping essential steps in the process.

How to provide visual learning analytics to different stakeholders is also an issue we are carefully considering. Currently, our work is aimed at instructional designers and instructors who are deeply interested in their courses. We are interested in also showing visualizations to students and instructors to help them gain insight into their learning and teaching and how that relates to usage of technology like lecture capture.

References
[1] J. Bowes. Knowledge management through data mining in discussion forums. Master’s thesis, University of Saskatchewan, 2001.
[2] C. Brooks, C. D. Epp, G. Logan, and J. Greer. The who, what, when, and why of lecture capture. In Proceedings of the 1st International Conference on Learning Analytics and Knowledge - LAK ’11, pages 86–92, New York, New York, USA, 2011. ACM Press.
[3] C. Brooks. A Data-Assisted Approach to Supporting Instructional Interventions in Technology Enhanced Learning Environments. PhD thesis, University of Saskatchewan, 2012.
[4] R. Mertens, M. Ketterl, and P. Brusilovsky. Social navigation in web lectures: a study of virtPresenter. Interactive Technology and Smart Education, 7(3):181–196, 2010.

Figure 3: Overlapping histograms showing the viewership of high achieving students (red), low achieving students (blue) and that viewership common to both groups (purple). Videos of the Chemistry 200 course were used for this plot.
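The binning behind a group histogram such as Figure 3 can be sketched roughly as follows. This is a minimal illustration and not the Recollect implementation: the record layout `(lecture_id, playhead_seconds, paused)`, the function names, and the choice to express each bin as a fraction of the heartbeats expected if the whole group watched it are all assumptions. It also assumes every lecture occupies a full 45-minute slot on the concatenated axis, and that the shared (purple) region corresponds to the per-bin minimum of the two groups.

```python
from collections import Counter

HEARTBEAT_SECONDS = 30          # heartbeat interval stated in the paper
BIN_SECONDS = 15 * 60           # one histogram bin per 15 minutes of video
LECTURE_CAP_SECONDS = 45 * 60   # lecture length after data cleaning


def viewership_histogram(heartbeats, lecture_order, n_students):
    """Return per-bin viewership as a fraction of full-group viewing.

    heartbeats:    iterable of (lecture_id, playhead_seconds, paused)
                   tuples (hypothetical record layout).
    lecture_order: lecture ids in the order they are concatenated on
                   the x axis.
    n_students:    number of students in the group.
    """
    # Start offset of each lecture on the concatenated x axis (assumed
    # to be a full 45-minute slot per lecture).
    offset = {lec: i * LECTURE_CAP_SECONDS
              for i, lec in enumerate(lecture_order)}
    n_bins = len(lecture_order) * LECTURE_CAP_SECONDS // BIN_SECONDS

    counts = Counter()
    for lecture_id, playhead, paused in heartbeats:
        if paused or lecture_id not in offset:
            continue  # paused heartbeats do not count as viewing
        if not 0 <= playhead < LECTURE_CAP_SECONDS:
            continue  # ignore positions beyond the 45-minute cap
        counts[int((offset[lecture_id] + playhead) // BIN_SECONDS)] += 1

    # Heartbeats expected if every student watched the whole bin once
    # (30 per student for a 15-minute bin).
    expected = n_students * (BIN_SECONDS // HEARTBEAT_SECONDS)
    return [counts[b] / expected for b in range(n_bins)]


def shared_viewership(group_a, group_b):
    """Per-bin overlap of two histograms (assumed: the smaller value)."""
    return [min(a, b) for a, b in zip(group_a, group_b)]


if __name__ == "__main__":
    lectures = ["lec1", "lec2"]
    high = viewership_histogram(
        [("lec1", 0, False), ("lec1", 30, False), ("lec1", 900, False),
         ("lec2", 60, True)],
        lectures, n_students=1)
    low = viewership_histogram(
        [("lec1", 0, False), ("lec2", 1000, False)],
        lectures, n_students=1)
    print(high)
    print(low)
    print(shared_viewership(high, low))
```

Because rewatching produces additional heartbeats at the same playhead positions, a bin can exceed 1.0 under this normalization, which is consistent with the histogram capturing both initial watching and rewatching.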