Videolectures Ingredients that can make Analytics Effective

Marco Ronchetti
Dipartimento di Ingegneria e Scienza dell’Informazione
Università degli Studi di Trento
Trento, IT-38050, Italy
marco.ronchetti@unitn.it

Copyright © 2013 for the individual papers by the papers' authors.
Copying permitted only for private and academic purposes.
This volume is published and copyrighted by its editors.
WAVe 2013 workshop at LAK’13, April 8, 2013, Leuven, Belgium.

Abstract
Videolectures over the Internet appeared at the turn of the century and
grew steadily in popularity, until they recently gained wide resonance
in the form of Massive Open On-Line Courses (MOOCs). Although
videolecture usage data have always been important, in the case of
MOOCs they are vital for the success of the initiative. In the present
paper, we suggest that some (already available) tools for extracting
semantic information from video should be used, as they may vastly
improve the meaningfulness of the information obtained from
videolecture analytics.

Author Keywords
Videolectures, effective analytics, semantics, MOOCs

ACM Classification Keywords
K.3.1 [Computers and Education]: Computer Uses in Education -
Computer-managed instruction (CMI).

Introduction
The idea of using recorded video lectures for teaching on a massive
scale goes back to the attempts to use TV as an educational medium.
Television introduced some educational programs (and later channels),
but only on rare occasions were they a success. One such case was the
Italian TV show “Non è mai troppo tardi” (It’s never too late), which
from 1960 to 1968 brought more than a
million illiterate people to a primary school certificate (probably one
of the most successful examples of TV-based distance education ever,
and a sort of early Massive Open On-line Course, or MOOC, even though
the “line” was not the Internet). Even before that there were
instructional movies, used for instance to demonstrate scientific
experiments that were too complex or too lengthy to be performed in a
school laboratory. Educational TV channels still exist today, such as
Teachers TV, a digital channel for everyone who works in schools. Its
programmes cover every subject in the curriculum, all key stages and
every professional teaching role. It can be accessed on digital cable
and satellite (and, more recently, also via the Internet).

In the seventies, VHS cassettes made it possible for the first time to
attempt turning videos into an “on demand” resource for satisfying
educational needs, but again the effort had only a marginal impact on
mainstream education.

At the end of the 1980s, a system that implemented a rather mechanical
process of individualized instruction was patented [1]. Part of the
system consisted in the ability to use some ad-hoc hardware to play
movies.

Only in the nineties did PCs acquire sufficient power and memory to be
considered tools for playing video and multimedia in general. With the
turn of the millennium, increased network bandwidth and the power of
mobile devices (laptops first, then tablets and smartphones) allowed
videos to be distributed over the Internet, which ultimately delivered
today’s ability to use video instruction anywhere and at any time.
Since the early experiments [3, 13], a lot of research has been done in
the field of Internet-delivered videolectures (for a review see
[9, 10]).

It then took about 15 years for these videolectures to move from the
work of the pioneers to the pages of the New York Times [8]. Their
diffusion grew progressively, with a first boost given (around 2005) by
the Apple iTunes U initiative, which also made it possible to extract
some usage data from the logs (see e.g. [2]). Along the way, for a few
years (again starting from 2005), the podcasting variant was a
fashionable approach. Only recently did MOOCs finally make it into the
official dictionaries: the MOOC entry in the English Wikipedia dates
from July 2011. A history of MOOCs in 2012, the year of the boom, is
reported in a post by Audrey Watters.

MOOC Numbers
Figures such as “1.7 million students for Coursera” or “ratio students
to professor 150,000:1 in Udacity” [8] are certainly impressive.
However, in spite of the popularity of MOOCs, there is little data on
them. Stories of success and failure are often anecdotal. Some
statistics are available from MOOC platforms like Coursera, Udacity and
MITx, and they are puzzling.

The first MIT MOOC (MITx 6.002x: Circuits and Electronics) attracted
154,763 registrants. However, only 45% of them (69,221 people) looked
at the first problem set, and of these only 26,349 earned at least one
point (17% of the enrolled): we can consider these as the ones who
showed real interest rather than mere curiosity.
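As a quick arithmetic check, the percentages quoted above can be
recomputed from the raw counts. This is only an illustrative sketch:
the figures come from the text, while the names REGISTRANTS, STAGES and
share are introduced here for convenience.

```python
# Counts quoted in the text for the first stages of MITx 6.002x.
REGISTRANTS = 154_763
STAGES = [
    ("looked at the first problem set", 69_221),
    ("earned at least one point", 26_349),
]


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Percentage of all registrants reaching each stage.
funnel = {label: share(count, REGISTRANTS) for label, count in STAGES}

for label, pct in funnel.items():
    print(f"{label}: {pct}% of registrants")
```

The same helper can be applied to any other pair of counts reported for
a course, for example to express a later stage as a fraction of the
students who earned at least one point.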
The number of interested students roughly halved by the midterm
assignment: 13,569 people looked at it while it was still open, and
9,318 people got a passing score on the midterm (6% of the enrolled).

In the end, after completing 14 weeks of study, 7,157 people earned the
first certificate (4.6% of the enrolled, i.e. 27% of those who showed
real interest). In spite of the gigantic drop, having more than seven
thousand students pass a course is a massive achievement indeed.

The numbers for Coursera’s Social Network Analysis class are less
encouraging. Out of the 61,285 students registered, 1,303 (2%) earned a
certificate, and only 107 earned “the programming (i.e. with
distinction) version of the certificate” (0.17%).

MOOC questions and challenges
The statistics raise several questions. The most compelling one is
probably “why aren’t a large number of students finishing the
course?”. This question may be difficult to answer, but answers to
other questions can be obtained by monitoring the users’ behaviour and
gathering statistics and analytics. Examples of such queries are the
following:

• Where do the students come from?
• Which videos are most popular, and which ones attract little
  interest?
• Are students actually watching the videos on the assigned dates?
• Are viewers watching all the way through?
• At what point in the lecture, if any, do viewers stop watching?
• Are there any portions of the videos that are being watched
  repeatedly?
• Are the students watching the videos by the assigned deadlines?
• Are the videos generating active user engagement?
• Do students edit, share, download the material?

The interpretation of the statistics may however not be easy. Knowing
that the segment of lecture N between times t1 and t2 is often reviewed
is not by itself a meaningful cue. What is there? To find out, we need
to watch the fragment ourselves. When the potentially interesting
sections or points are many, this may be a very time-consuming task.
The problem arises from the lack of semantic information.

Some help may come from a fine-grained structure of the material. For
instance, if “lectures” are broken into small pieces (20 minutes), as
in the case of Khan Academy, or even smaller ones (10-minute fragments,
as in certain Coursera courses), it is likely that each unit has
well-defined semantics. If instead a lecture is recorded in class, and
hence follows time constraints dictated by logistics rather than by
content, things are much more difficult.

In these cases, substantial help may come from certain elements that we
claim to be important ingredients of videolectures:

• multiple (parallel) cognitive channels,
• semantic marking,
• transcripts,
• annotations
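To make the difficulty concrete, here is a minimal sketch of the query
“Are there any portions of the videos that are being watched
repeatedly?”. It does not describe any existing platform: the log
format, the 60-second bins and the list of timestamped labels are all
assumptions introduced for illustration. Without the labels, the answer
is only a time offset; joined with a semantic marking, it becomes a
topic.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical playback log for one lecture:
# (viewer_id, start_second, end_second) of each played segment.
PLAY_EVENTS = [
    ("s1", 0, 120), ("s1", 60, 120),
    ("s2", 0, 180), ("s2", 60, 120),
    ("s3", 60, 120),
]

# Hypothetical semantic marking: (start_second, label), sorted by start.
MARKS = [(0, "Introduction"), (60, "Kepler's third law"), (120, "Exercises")]

BIN = 60  # bin width in seconds (an arbitrary choice for this sketch)


def view_counts(events, bin_size=BIN):
    """Count how many times each time bin was played, over all viewers."""
    counts = Counter()
    for _viewer, start, end in events:
        first_bin = start // bin_size
        last_bin = -(-end // bin_size)  # ceiling division
        for b in range(first_bin, last_bin):
            counts[b * bin_size] += 1
    return counts


def label_at(second, marks=MARKS):
    """Return the label of the marked fragment containing `second`."""
    starts = [start for start, _label in marks]
    i = bisect_right(starts, second) - 1
    return marks[i][1] if i >= 0 else None


counts = view_counts(PLAY_EVENTS)
hot_start, plays = counts.most_common(1)[0]

print(f"Most played bin starts at {hot_start}s ({plays} plays)")
print(f"Topic of that fragment: {label_at(hot_start)}")
```

The first printed line is all that raw usage data can offer; the second
is available only if some form of semantic marking exists for the
lecture.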
Videolecture enhancements that may (also) help analytics
The ingredients we mentioned are not really new, as some people have
been using them for years in the context of videolectures as tools for
improving the user experience. For instance, semantic annotation has
been used for facilitating lecture navigation (see e.g. [11]), and
transcripts have helped in searching a videolecture (see below).
However, in the light of analytics they take on a new dimension. Let us
briefly examine them.

The first component we mentioned is multiple cognitive channels.
Typically, on-line lectures in MOOCs use exactly two channels: they are
either video + audio, slides + audio (the so-called webcasts), or
computer screen + audio (as e.g. in the case of Khan Academy). There
are even lectures based on audio alone (podcasts), although these were
mostly used before the term MOOC became popular.

In contrast, even the much-maligned frontal lectures in class are based
on a richer paradigm. The teacher uses the blackboard and PowerPoint
slides, may project his/her computer screen, and at the same time
students see gestures and facial expressions. It is quite possible to
reproduce such an environment even in on-line lectures. A variety of
authoring systems allow the use of (at least) two visual channels in
parallel (e.g. slides + video), making the on-line lecture richer.
While Moreno and Mayer [7] suggested that the presence of multiple
cognitive channels brings a negative “split attention” effect, Glowalla
[6], a German instructional psychologist, reported that with lectures
showing both a video and slides learners show better concentration,
while the audio + slides version is perceived as more boring. Data
obtained by other investigators [4] confirm the better efficacy brought
by the presence of video as an additional cognitive channel. We believe
MOOCs should adopt such a rich communication paradigm, and not rely on
the poorer paradigm based on a single video channel (+ audio).

This choice would help introduce the second ingredient: semantic
marking. Having, for example, slide transitions makes it very easy to
associate metadata with specific portions of a video. When a teacher
presents a slide, what is s/he talking about? Most likely, we find the
answer in the slide title. If slide transition timing and slide content
are captured while recording the video, it becomes extremely easy to
tag the video with semantic annotation. Questions like the ones we have
mentioned, e.g. “Are there any portions of the videos that are being
watched repeatedly?”, may now have a significantly more interesting
answer than “at time nn:nn”: the answer might rather be something like
“the fragment discussing Kepler's third law”. The power of analytics is
suddenly vastly increased, precisely because of the availability of
semantic metadata. And the important point is that such metadata, a
resource that is notoriously difficult and costly to obtain, are
generated automatically!

Along the same lines, the availability of (synchronized) audio
transcripts allows meaningful information to be associated with the
timeline. A few years ago, we [5] successfully experimented with
Automatic Speech Recognition tools to enrich videos with synchronized
transcripts that allowed students to perform searches in on-line
videolectures. This technique would of course also allow any data
coming from analytics to be mapped onto the content without the need
for visual inspection of video fragments. Natural language processing
(NLP) tools
could be used to extract additional semantic information from a
specific video fragment.

Finally, we mention in passing that the possibility for students to
annotate video lectures would be yet another precious source of
information. Again, this would be a case of a feature that was
originally designed to achieve a particular goal (e.g. to grow a sense
of community around a set of videolectures), and that would acquire
additional value in the context of the usage analysis that is typical
of analytics tools. This would be true for the extra information that
NLP tools could mine from the notes; in addition, data regarding
annotation would per se be an extra source that could be mined (e.g. to
find correlations with the difficulty or interest of a particular video
portion).

Conclusion
MOOCs may be just an ephemeral fashion, or might revolutionize the
future landscape of higher education: only time will tell. In this
short paper we advocated the need for them to embrace a richer
cognitive paradigm, and to be enriched by metadata associated with
video fragments. The availability of such metadata, which should be
automatically extracted, provides important hints that make the
information extracted by videolecture analytics much more significant.

References
[1] A. Louis Abrahamson, Frederick F. Hantline, Milton G. Fabert,
Michael J. Robson, Robert J. Knapp. (1989). “Electronic classroom
system enabling interactive self-paced learning”, US Patent 5002491.
[2] A. Defranceschi, M. Ronchetti. (2011). "Video-lectures in a
traditional mathematics course on iTunes U: usage analysis" in
Proceedings of EDMEDIA 2011 - World Conference on Educational
Multimedia, Hypermedia and Telecommunications 2011, Chesapeake, VA:
AACE, 2011, p. 720-727.
[3] M.H. Hayes. (1998). Some approaches to Internet distance learning
with streaming media, Second IEEE Workshop on Multimedia Signal
Processing, Redondo Beach, CA, USA (1998).
[4] A. Fey. (2002). Audio vs. Video: Hilft Sehen beim Lernen? Vergleich
zwischen einer audio-visuellen und auditiven virtuellen Vorlesungen.
Lernforschung, 30. Jhg (4):331-338 (in German).
[5] A. Fogarolli, G. Riccardi, M. Ronchetti. (2007). “NEEDLE: Searching
information in a collection of video-lectures” in Proceedings of World
Conference on Educational Multimedia, Hypermedia and Telecommunications
ED-MEDIA 2007, Norfolk (Va): AACE, 2007, p. 1450-1459.
[6] U. Glowalla. (2004). Utility and Usability von E-Learning am
Beispiel von Lecture-on-demand Anwendungen. In Entwerfen und Gestalten,
2004 (in German).
[7] R. Moreno, & R. E. Mayer. (2000). A Learner-Centred Approach to
Multimedia Explanations: Deriving Instructional Design Principles from
Cognitive Theory, Interactive Multimedia Electronic Journal of
Computer-Enhanced Learning (2).
[8] L. Pappano. (2012). “The year of the MOOC”, The New York Times,
Nov 2, 2012.
http://www.nytimes.com/2012/11/04/education/edlife/massive-open-online-courses-are-multiplying-at-a-rapid-pace.html?pagewanted=all&_r=0
[9] M. Ronchetti. (2011). "Perspectives of the Application of Video
Streaming to Education" in Ce Zhu, Yuenan Li, Xiamu Niu (eds.),
Streaming Media Architectures, Techniques, and Applications: Recent
Advances, Hershey PA, USA: Information Science Reference, IGI Global,
2011, p. 411-428.
[10] M. Ronchetti. (2011). "Video-Lectures over Internet: The Impact on
Education" in G. Magoulas (ed.), E-Infrastructures and Technologies for
Lifelong Learning: Next Generation Environments, New York: IGI Global,
2011, p. 253-270.
[11] M. Ronchetti. (2012). "LODE: Interactive demonstration of an open
source system for authoring video-lectures" in Interactive
Collaborative Learning (ICL), 2012 15th International Conference on,
Los Alamitos, USA: Computer Society Press of the IEEE, 2012, p. 1-5.
[12] D. McKinney, J.L. Dyck, & E.S. Luber. (2009). iTunes University
and the classroom: Can podcasts replace Professors? Computers &
Education 52, 617-623.
[13] F. Tobagi. (1995). Distance learning with digital video. IEEE
Multimedia, vol. 2 (1), pp. 90-93.