=Paper=
{{Paper
|id=Vol-3052/ipaper7
|storemode=property
|title=A Review on Recent Advances in Video-based Learning Research: Video Features, Interaction, Tools, and Technologies
|pdfUrl=https://ceur-ws.org/Vol-3052/paper7.pdf
|volume=Vol-3052
|authors=Evelyn Navarrete,,Anett Hoppe,,Ralph Ewerth
|dblpUrl=https://dblp.org/rec/conf/cikm/NavarreteHE21
}}
==A Review on Recent Advances in Video-based Learning Research: Video Features, Interaction, Tools, and Technologies==
Evelyn Navarrete (1), Anett Hoppe (1, 2), and Ralph Ewerth (1, 2)

(1) L3S Research Center, Leibniz University Hannover, Germany

(2) TIB – Leibniz Information Centre for Science and Technology, Hannover, Germany

Proceedings of the CIKM 2021 Workshops, November 1–5, Gold Coast, Queensland, Australia. Contact: navarrete@l3s.de (E. Navarrete); anett.hoppe@tib.eu (A. Hoppe); ralph.ewerth@tib.eu (R. Ewerth). ORCID: 0000-0002-5610-9908 (E. Navarrete); 0000-0002-1452-9509 (A. Hoppe); 0000-0003-0918-6297 (R. Ewerth). © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.

===Abstract===
Human learning is shifting more strongly than ever towards online settings, and especially towards video platforms. There is an abundance of tutorials and lectures covering diverse topics, from fixing a bike to particle physics. While it is advantageous that learning resources are freely available on the Web, the quality of the resources varies a lot. Given the number of available videos, users need algorithmic support in finding helpful and entertaining learning resources. In this paper, we present a review of the recent research literature (2020-2021) on video-based learning. We focus on publications that examine the characteristics of video content, analyze frequently used features and technologies, and, finally, derive conclusions on trends and possible future research directions.

Keywords: Video-based Learning, web-based learning, literature review, video features

===1. Introduction===
The Web has fundamentally changed the way we learn. Video platforms such as YouTube play an increasing role here: of the about 30 million video views a day, about 50% relate to some kind of learning content [1]. Another YouTube-related statistic states that, as of May 2019, about 500 hours of new content were uploaded to the platform every minute [2], implying that users are reliant on effective search and recommendation algorithms to find content that is relevant to their learning needs.

These algorithms need to consider multiple factors to provide suitable rankings and recommendations. Especially in learning contexts, an open question is: When is a video the best way to learn, depending on the individual user, the learning objective, and context factors? This question has been investigated in several studies in recent years, examining characteristics of the videos and the surrounding platforms to determine when video-based learning (VBL) processes are especially successful. The resulting insights are of interest for all the involved stakeholders: (a) platform providers who wish to provide their users with the best possible content from their database, but also (b) content producers who aim to develop efficient and entertaining learning resources, (c) teachers who seek to enhance their teaching by integrating adapted multimedia resources, and finally, (d) students who pursue a more effective learning process and experience.

In this paper, we provide a review of recent research on the analysis of video-based learning, with a focus on studies that examine video characteristics. For this purpose, we performed a systematic literature search. Here, we present a preliminary summary of our findings from the most recent publications, covering the years 2020 and 2021. We systematize the contents of 41 reviewed papers and provide an overview of (a) the chosen research approach (e.g., empirical study, tool prototype, production guidelines), (b) considered video characteristics, and (c) tasks and technologies used to develop VBL tools and frameworks. From these dimensions, we derive recent research trends and identify gaps that provide directions for future research.

The rest of this paper is structured as follows: Section 2 presents the related work; Section 3 describes our methodology; Section 4 provides an overview of the video-based learning field, its benefits, potentials, and current challenges, and explains why identifying relevant features in videos is important. Section 5 presents the results of the review. Finally, we summarize our findings, point out the limitations, and conclude with indications for future research in Section 6.

===2. Related Work===
We found four survey articles that examine video-based learning. Their core characteristics are summed up in Table 1. All of them reference the significant development in research on VBL [4] or the increase in publications [5, 3] as a motivation for their work. Giannakos [3] and Poquet et al. [5] find that there is an increase specifically in empirical studies.

{| class="wikitable"
|+ Table 1: Core information of related survey articles: reference and publication year, temporal scope of the survey, number of reviewed papers, and reviewed characteristics
! Ref. !! Review period !! N° papers !! Review dimensions
|-
| [3] (2013) || 2000-2012 || 166 || 1) Type of research; 2) Sample; 3) Subject area; 4) Technology type; 5) Style of use
|-
| [4] (2014) || 2003-2014 || 76 || 1) Learning effectiveness; 2) Teaching methods; 3) Design; 4) Reflection
|-
| [5] (2018) || 2007-2017 || 178 || 1) Video type; 2) Population; 3) Control of prior knowledge; 4) Sample size; 5) Video features
|-
| [6] (2020) || 2008-2019 || 39 || Teacher perception and reflection
|}

All four surveys focus on different key characteristics (see Table 1, column “Review dimensions”) and derive common themes and research trends based on these. For instance, Giannakos [3] centers on the development of the field and shows a shift in the used technologies. Yousef et al. [4] concentrate on the effectiveness of VBL and provide a review of teaching methods using video, design features, and how video-based contents and tools are integrated. Poquet et al. [5] spotlight the video characteristics that have been analyzed in research with specific regard to their influence on learning effectiveness. As stated by the authors, the most often used metrics to qualify the effectiveness are recall (remembering information) and transfer (applying what was learned to different scenarios), followed by motivation, cognitive load, mental effort, attention, and affect. The most analyzed features are text, audio, animations, and the video production style. Table 2 summarizes the features derived in their review. It is also found that when data sources are available, the most common sources are eye-tracking and click-stream data. A very recent work by Sablic et al. [6] reviews approaches to support teachers’ self-reflection and professional development with video recordings of teaching sessions.

{| class="wikitable"
|+ Table 2: Literature review findings of Poquet et al. [5]
! Category !! Features
|-
| rowspan="3" | Presentation features
| Video modality (e.g., text, audio, animation, and voice over slides)
|-
| Text customization (e.g., personal pronouns)
|-
| Signaling (e.g., highlight relevant information)
|-
| rowspan="7" | Others
| Video usage (e.g., videos instead of face-to-face lectures)
|-
| Content (e.g., metaphors in contrast to descriptive language)
|-
| Task (e.g., annotation features)
|-
| Quizzes
|-
| Learner characteristics
|-
| Learner control (e.g., pause, play)
|-
| Distraction
|-
| rowspan="2" | Dependent variables
| Recall (remembering information)
|-
| Transfer (applying what was learned to different scenarios)
|}

Giannakos [3] discovers a shift in the focus of research from social science disciplines to more technological domains. In accordance with these findings, Poquet et al. [5] state that the reviewed studies focus on STEM topics more often than on other subjects. The authors further find that these are followed by psychology, humanities, and social domains. The identified research trends include collaborative video-based learning [4], the effect of annotation and authoring features [4], and the use of videos as an instrument for reflective teacher education [6].

The present review follows a similar format to the one presented by Poquet et al. [5]: our results are focused on the video characteristics that are analyzed or experimented with, and we collect additional context information such as study sample sizes and covered subject areas. Further, this review is more comprehensive in that it includes not only papers that studied the effects on learning effectiveness but any paper that examines features of video-based learning materials, regardless of the ultimate goal. Examples are papers whose aim is to propose tools, analyze data from e-learning platforms, or suggest features that should be considered in the design of educational videos. Also, in contrast to Giannakos [3] and Yousef et al. [4], this review goes deeper into the technological trends of VBL by detailing not only the developed tools but also the technological tasks carried out to develop such tools. Moreover, our review complements previous studies by focusing on the most recent publications (2020-2021).

We inductively derive a taxonomy of researched video features. This taxonomy is used to structure our account of research in the domain and will guide the successive extension of our review towards earlier works.

===3. Methodology===
This literature review covers studies extracted from three academic databases: (a) the Digital Bibliography and Library Project (DBLP), (b) the Association for Computing Machinery (ACM), and (c) SpringerLink. These are specialized in the computer science field and, thus, provide suitable domain specificity not only for our review of technological systems surrounding video-based learning but also to identify the state of the art in this specific research area.

We iteratively refined our set of query terms to include new vocabulary from the research papers. Whenever we encountered a new term to describe research around VBL in an article, we went back to the databases and re-ran the search with the new terminology. The final set of query combinations can be found in Table 3.

{| class="wikitable"
|+ Table 3: Search queries: Base terms in the first column are joined (using logical AND) with extensions in the second column
! Base !! colspan="3" | Extension
|-
| rowspan="7" | Video-based Learning Video
| Education || Educational || Instructional
|-
| Learn || Learning || Instruction
|-
| Teach || Teaching || Explaining
|-
| Knowing || Knowledge || Explanatory
|-
| Tutorial || Student || Classroom
|-
| Pupil || School || University
|-
| Skills || ||
|}

The resulting set of papers was filtered based on the following criteria:

# Videos are used in academic learning scenarios (this excludes tutorials for everyday tasks, such as cooking instructions).
# The article considers videos as learning resources (this excludes the use of video recordings in reflective scenarios, such as in teacher education).
# The article examines specific characteristics and features of the video-based learning materials.

Here, we report on a subset of 41 articles, focusing on very recent publications from the years 2020 and 2021. It is planned to extend the review in the future to cover a longer time period.

===4. Video-based Learning===
This section gives a short introduction to video-based learning, including a discussion of potentials and challenges as outlined in the reviewed papers, and motivates our specific interest in research works on the analysis of video characteristics. It lays the foundation for the following discussion of the literature review with a focus on video features.

The task of learning by using video sources has adopted the name of video-based learning (VBL). This terminology is used in several studies (e.g., [3, 5, 4, 6]). In the context of this review, the term “video-based learning” is employed in the same way, that is, as gaining knowledge and skills by using video material. This process is manifold and involves several actors beyond the learner and the video, such as learning platforms. Consequently, we consider VBL in its wider context by including features of video platforms.

Potentials: It has been argued that VBL can be a powerful tool to enhance teaching and learning. Yousef et al. [4], for instance, state that educational videos are effective, able to increase motivation, engage the learner, and support diverse learning styles. The authors further underline the videos’ potential to convey information that is hard to capture in text, e.g., to visualize procedural information. They cite several studies that present evidence about videos improving not only the learning outcome but also the satisfaction, interaction, and communication among learners.

Videos are already an integral part of students’ learning habits [1] and common teaching practices [5]. Indeed, some even state that in online education, videos might become the main medium [7]. One reason for this development is certainly availability: Video production has become much easier [5], and dedicated platforms provide simple, scalable ways for dissemination [3, 8]. This allows laypeople and professional teachers to make their materials available to the world.

Challenges: While offering many proven benefits, there are challenges involved in the production and dissemination of efficient video-based learning resources. Guo et al. [8] found in interviews that the video editing “was not done with any specific pedagogical ‘design patterns’ in mind” [8, p. 45]. The authors also conclude that the production of videos is based on “decisions on anecdotes, folk wisdom, and best practices distilled from studies with at most dozens of subjects and hundreds of video watching sessions.” [8, p. 42], which is evidence of a lack of understanding of how to produce effective videos. Furthermore, it is still unclear if techniques from classical classroom scenarios can be directly transferred to VBL environments, or if and to what degree VBL scenarios need customized didactic approaches and concepts.

Another challenge arises from the question of how a learner can be supported in her video-based learning trajectory. It is widely recognized that learners have diverse needs [8], depending on their current learning objective, context, and general preferences. How to analyze and index video resources to satisfy these individual requirements is still an open research question. As we will show below, exploring possible features for the automatic determination of video content and quality is one active area of research. These explorations are indispensable in the quest to allow diverse learners a satisfying video-based learning experience.

Lastly, MOOCs (Massive Open Online Courses) and similar educational courses are a central research area, mainly because they have enabled data analysis at scale. But it is precisely the large-scale nature of MOOCs that can cause issues in handling and processing such a massive amount of information. As stated by Guo et al., “The scale of data from MOOC interaction logs hundreds of thousands of students from around the world and millions of video watching sessions is four orders of magnitude larger than those available in prior studies” [8, p. 42]. Moreover, criticism directed at the research community has been raised as well. For example, a challenge that needs to be confronted in these large-scale scenarios is finding how to incorporate controlled experiments [4] and methods or instruments from other fields, such as psychology, instead of relying only on data analysis [5].

Importance of features: Research on VBL poses a number of relevant questions: How can efficient learning be facilitated and improved? How to enable long-term learning success? How to ensure the learner’s engagement and satisfaction? The answer to these questions strongly relies on one base research question: What characterizes an effective learning video? [8, 4, 5] The search for relevant video features has been widely studied from the perspective of different research fields (e.g., [9]), and in an exploration of what is possible with current technologies (e.g., [10]). In consequence, there is a multitude of variables that have been extracted, studied, and manipulated. Video is a complex medium combining audio, text, and visual information, from all of which features can be extracted and combined. These can range from low-level features, such as audio quality features (e.g., [11]), to high-level features that try to capture semantics (e.g., [12]). Features need not come directly from the video; they can also result from the interaction with a video, e.g., views and likes (e.g., [13]).

In addition to the characteristics of the video, it is vital to study its environment. The success of a VBL process is further influenced by functionalities of the surrounding platform, the learner’s context, and the instructor captured in the video.

In our review, we focus on those features directly related to the learning video, with the goal to outline current insights and research trends. The discussion of learner-related features, such as previous knowledge and skills, is out of the scope of this review (see, for instance, [14] for a recent review).

===5. Literature Review===
This section summarizes our findings from the analysis of 41 publications on video-based learning in the period 2020-2021. We analyze (a) the principal research approach (Section 5.1); (b) the target task and the used technologies (Section 5.2); (c) the video features explored in the studies (Section 5.3); and (d) the covered subject domains, to verify former surveys’ finding that there is a specific focus on STEM subjects in VBL studies (Section 5.4).

====5.1. Research Approaches====
Four main research approaches have been identified in the reviewed literature:

# Controlled experiments, which in this study are defined as experiments where one or a few variables are manipulated (according to the learning context, video characteristics, or content) and where the impact of this manipulation is measured by some target metric. This group includes experiments run in laboratory settings, but also those performed in realistic academic environments such as university courses. This category accounts for 22 instances (54%) in the reviewed papers.
# Novel tools, architectures, or analysis pipelines surrounding video-based learning scenarios are covered in 16 (39%) articles.
# Design principles and guidelines for the creation of educational videos are presented in two publications (5%).
# Results of the analysis of data generated by users in a video-based online learning platform are presented in one paper (2%).

Table 4 lists all the publications in each category. The high number of empirical studies matches Poquet et al.’s [5] statement that, especially after 2016, empirical works are on the rise. However, an extension of the reviewed time period is necessary to see if our findings agree with the authors’ temporal placement of the change. Within the category of controlled experiments, 21 out of 22 studies reported the number of participants in the experiment (sample size). The smallest study reported 12 participants [15] and the biggest acquired a group of 229 learners [16]. The average number of participants was 85 learners (SD=56). Studies that focused on the development of tools (group (2)) or data analysis of learning platforms such as MOOCs (group (4)) often do not report a sample size.

Some of the subsequent sections focus on a subset of the groups since not all dimensions can be meaningfully extracted from all research objectives. The following review of used technologies, for instance, only makes sense in the context of works that have a technology component, and will, consequently, mostly cover works of group (2).

{| class="wikitable"
|+ Table 4: Papers according to the research approach
! Research approach !! Publications
|-
| Controlled experiments || [16, 15, 17, 18, 19, 20, 10, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
|-
| Tool, architecture, pipeline || [11, 12, 36, 37, 38, 13, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48]
|-
| Design principles || [49, 50]
|-
| Data analysis || [9]
|}

====5.2. Target Task and Used Technologies====
This section focuses on the 16 studies which present a technology component. Of these, roughly a third (N=5, 31%) present recommender systems that aim to support learners in finding suitable learning material. One of the systems recommends not only video resources but also learning materials in other modalities (e.g., text articles such as Wikipedia pages [39]). Another two systems provide the user with a video sequence (playlist) instead of single learning contents [41, 42].

A popular approach to recommending material is based on the extraction of topical keywords from the video’s speech transcript [39, 40, 42], especially by using the tf-idf algorithm. These keywords are then used to compute the similarity of the currently viewed video to the other ones in the database [39, 40]. Tavakoli et al. [13] suggest using not only the current video for this but all resources previously viewed by the user. They combine this information with a profile of the learner’s skill set and predict relevant videos using a Random Forest classifier. Another technique is to detect prerequisite relations between the videos to generate sequences organized by difficulty and complexity [41, 42]. Lastly, Tang et al. [42] additionally analyze viewer comments with sentiment analysis to generate a playlist, assuming that positive-sounding comments point to engaging learning material.

Other common use cases are systems for the forecasting of learning success (N=3, 19%). “Success” can be operationalized in different ways: In two studies, the objective is to predict the final score of the learner as “pass” or “fail”. The first case is in the context of passing a university MOOC course [36], and the second case is about passing video quizzes presented by videos in an e-learning platform [38]. In a third study, the target is to forecast the probability of a student dropping out of a Coursera course (a MOOC provider) [37]. All these recent examples make use of deep learning technologies, specifically, neural networks with gated recurrent units [37], recurrent neural networks (RNN) [38], or long short-term memory neural networks (LSTM) [36]. These classifiers have been mainly fed with real-time features [36, 37], especially play, pause, and speed rate; however, aggregated data [37] and results of quizzes have also been explored [38].

Two of the 16 reviewed papers in this section (13%) focus on video segmentation. Video as a learning medium has the downside that all information is presented sequentially. Unlike text, videos cannot be easily skimmed to find relevant information. This can be alleviated by automatic video segmentation. In our sample, one paper aims to automatically determine important segments in the educational content using a bidirectional LSTM network classifier built on pre-trained models to extract text and visual features [11]. Das & Das [12] improve on topical segmentation of lecture videos by identifying concepts in the speech transcripts. To achieve this, speech-to-text technologies are used, pre-trained neural network models are used to analyze the resulting text, and the cluster centroid algorithm is used to group the resulting concepts.

The remaining six studies (38%) focus on diverse tasks such as video indexing [46], video ranking [43], video customization according to the type of learner [47], prediction of instructors’ enthusiasm [48], matching videos with practical exercises [44], and helping educators with the creation of teaching materials [45].

====5.3. Features====
The core focus of this review is the collection of video characteristics that have been studied in the literature (2020 and 2021). The objective is to provide an account of how video features have been investigated, extracted, and varied in the context of studies on video-based learning in the computer science field. We developed a feature taxonomy that groups the video features into categories, which allows for a structured presentation of our findings.

We identified eight high-level categories: (a) audio features, (b) visual features, (c) textual features, (d) features related to the instructor’s behavior, (e) features resulting from the interaction of the learner with the video (e.g., likes), (f) interactive features, (g) production style (e.g., Khan style), and (h) features related to instructional design principles. All of these will be defined and their appearances reviewed in separate paragraphs.

{| class="wikitable"
|+ Table 5: Features taxonomy
! Category !! Sub-category !! Examples
|-
| rowspan="3" | Audio features
| Audio records [39]
|
|-
| Quality-related features [11, 9, 45, 50]
| Energy, entropy, spectral features
|-
| Person-related audio features [24, 45, 48, 50]
| Clarity of instructor’s voice, speech rate, vocabulary, attention guiding emphasis
|-
| rowspan="4" | Visual features
| Frames [11, 12, 44]
|
|-
| Representation features (directing attention) [24, 25, 26, 27, 19, 45]
| Animations, realistic visuals, visual cues (e.g., color-coding, highlighting, arrows)
|-
| Enhanced visuals [23, 20, 10, 21, 22]
| Augmented/virtual reality, 360-degree features
|-
| Quality-related features [9]
|
|-
| rowspan="3" | Text
| Transcripts [11, 12, 13, 39, 40, 41, 42, 43, 44, 46]
|
|-
| Metadata [13, 40, 42, 36, 9]
| Titles, keywords, tags, video length
|-
| Visual [24, 44]
| Visual text (text extracted from frames); textual cues (guiding attention to a key location in a slide)
|-
| rowspan="3" | Instructor’s behavior
| Gestures [16, 25, 48]
| Beat gestures (e.g., hand strokes, natural hand-waving, and pointing gestures that aim to highlight the speech), facial emotions
|-
| Body-related [48]
| Volume (expansion of the teacher’s body), pose
|-
| Personality [45]
|
|-
| rowspan="2" | Interaction between learners and the video
| Real-time features [36, 37, 33, 26, 34, 35, 47]
| Play, pause, rate-change (speed), seek forward, seek backward; percentage of the video that was watched; average proportion of videos watched per week; standard deviation of the proportion of videos watched per week
|-
| Popularity [13, 9]
| Number of views, user ratings, relevancy score (rank in search results)
|-
| Interactive features
| [38, 33, 30, 28, 26, 32, 29, 31]
| Quizzes, annotation, feedback, exchange (actions to interact with other learners)
|-
| Production style
| [18, 32, 15, 17, 19]
| Tutorial, lecture, Khan style, talking head, dialogue and monologue, voice over slides, only-text, animations included
|-
| Instructional design principles
| Principles of multimedia learning [9]
| Coherence, signalling, spatial contiguity, segmentation (information in segments, rather than a long stream), pre-training (present first the basic information), modality (graphics and spoken words), multimedia (words together with pictures), personalization (informal conversational style), voice (human rather than a computer voice), image (animations)
|}

Audio features are all features that relate to the actual video’s audio stream and also the features that can be extracted from it. This includes audio records (e.g., [39]) and quality-related features, such as energy, entropy, and spectral features, which allow making predictions on how well the auditory information can be perceived, e.g., [9, 11, 45]. A second big cluster is person-related audio features, which groups characteristics directly related to the instructor’s expression [24, 45, 48]. This includes, for instance, metrics for the teacher’s voice [45, 48], speech rate [50], used vocabulary [50], and emphasis on specific words that guide the listener’s attention [24].

The category visual features contains analyses or adaptations related to the video’s image information. This includes the analysis of information in video frames as images [11, 12, 44]. Particular to video, in contrast to text and still image, is that movement can guide the learner’s attention. These types of features are captured in the category visual representation features where we can find, for example, highlighting and visual cues (e.g., animated and appearing elements) [24, 25, 26, 27, 19, 45]. There is also work investigating visual quality features [9]. Finally, some works explore enhanced visual representations by including 360-degree scene representations [20, 10, 21], augmented [23], and virtual reality features [22].

In the category text features, we group all text-based information which is available about the video or can be extracted from it. It is common practice to generate a speech transcript from the spoken content in a learning video. This pre-processing step transforms hard-to-analyze speech into written text, for which sophisticated and efficient analysis methods are available. Indeed, in our sample, speech transcripts are widely used [11, 12, 13, 39, 40, 41, 42, 43, 44, 46], and are referenced in all the works on video segmentation [11, 12], and on recommending subsequent videos [36, 37, 38].

Many video platforms provide metadata about the hosted videos, which are widely used as an easily available data source. These include structured information about the video’s title, attributed keywords, tags, and similar [13, 40, 42]. Video length has been used, for instance, by Mubarak et al. [36] as a factor to predict the learning outcome, and by Tavakoli et al. [13] to suggest similar videos. Finally, text, especially in learning videos, also appears to support and visualize the speakers’ message in the form of superimposed scene text, often in the form of presentation slides (visual text) [24, 44].

In every learning process, the teacher is a central facilitator; there are thus a number of features related to instructor behavior [16, 25, 45, 48]. This group captures gesturing [16, 25] and facial emotions [48], features related to the body such as volume and pose [48], and also information about the speaker’s personality [45], which have been used as base techniques to determine higher-level information.

Several studies use learner interactions with the video as a feature. This includes real-time features of a user interacting with a certain instructional video [36, 37, 33, 26, 34, 35, 47], e.g., playing and pausing a video, skipping video content or interrupting; and aggregated measures, such as the number and ratio of videos watched in a time period, or the aggregated behavior of several users on a certain video (e.g., [47]). Besides, user interactions are used as indicators of video popularity [13, 9], in the form of the number of views and ratings.

Interactive functionalities surrounding the video: A critical challenge in videos is that information is transient [51] and consumed in a rather passive way [52, 53]. This invites superficial processing and leads to challenges in knowledge acquisition and integration. In consequence, several studies investigate functionalities to actively involve the user [38, 33, 30, 28, 26, 32, 29, 31] by adding functionalities such as interactive quizzes, collaborative elements, or note-taking areas.

The production style category classifies characteristic types of video design [54]. It includes rough sub-categories of how the learning setting is composed, e.g., whether people are visible (talking-head video, lecture setting with or without an audience) or not (voice-over-slides, Khan-style lecture, animations, and films of real-world phenomena), whether there is a single speaker (monologue) or several (dialogue, interview), and from which perspective the content is presented (upfront, or over-the-shoulder view in tutorials). Production style was mentioned several times as a feature in our sample [18, 32, 15, 17, 19].

There are a number of psychological and educational theories on how to efficiently support learners with multimedia resources; references to those are collected in the category instructional design principles. The only theoretical element referenced in our sample is the set of Multimedia Learning Principles introduced by Mayer [55], which was used by Eradze et al. [9]. They collected manual annotations that state whether a video follows those design principles (see Table 5 for examples). The aim was to find a correlation between the principles and students’ perceptions regarding the quality of the video.

As shown by the taxonomy in Table 5, from all the studies reviewed in this research work (2020-2021), the most explored video features are text-related features, especially transcripts. The next most investigated are interactive features, i.e., additional functionalities (quizzes, note-taking) that impact the learning success. In third place, we find real-time features. Other commonly explored features are visual features, especially features that direct attention (e.g., animation), enhanced visuals (e.g., 360-degree features), and production style (e.g., talking head). These findings are similar to the results of Poquet et al. [5], who show that text, animation, audio, and production style are the most explored features.

Lastly, we found that the impact of the above-mentioned features has been measured mainly through the learning outcome using metrics such as recall and transfer [16, 33, 30, 23, 24, 25, 28, 26, 27, 10, 32, 15, 29, 22, 34, 17, 35, 19], which, again, conforms to the results of Poquet et al. [5]. Other frequently used metrics are: cognitive load [16, 25, 32, 22, 34, 17, 31, 19, 33, 23, 26, 18], motivation [30, 23, 20, 10, 22, 17], self-efficacy [28, 22, 17], mental effort [16, 32, 19], eye-gaze direction [25, 15, 17], engagement [20, 26], comprehension (understanding) [21, 29], and social presence [16, 17]. Of these, motivation, cognitive load, mental effort, and the results of eye-tracking were also included in the findings of Poquet et al. [5] as often used metrics. On the other hand, the less frequently used metrics encompass: enjoyment [10], agent-persona (credibility, engagement, and learning facilitating) [16], affective rating [16], para-social interaction [16], self-regulated learning [33], sense of presence (feeling of being present in the environment) [10], perceived benefits [21], confidence [19], creative thinking [22], application [21], analysis [21], synthesis [21], evaluation [21], attention span [32], views about the teaching technique [27], facial temperature [19], and challenge [19].

====5.4. Covered Subject Domain====
Previous literature reviews stated that STEM (Science, Technology, Engineering & Mathematics) areas are overrepresented in the investigation of VBL [3, 5]. This is confirmed in our sample of research papers from 2020-2021. In our first category of controlled experiments, 14 of 22 studies (64%) use videos on some STEM domain. Specifically, computer science [28, 20, 17, 35] and physics [30, 18] are often targeted. In the remaining eight papers, teaching English as a foreign language is the most common discipline [23, 29], followed by other social and humanities domains.

In the research categories “Data analysis” and “Design principles & guidelines” the main focus is also on STEM video content. However, the small sample size in these categories does not allow further interpretation. Papers in the “Tools” research category do not usually focus on specific disciplines but aim at offering general solutions for VBL.

to improve video segmentation for educational videos; the approaches widely rely on pre-trained models for the extraction of textual and visual features.

Video features: In general, the most explored video features are text-related, especially metadata and speech transcripts. The second most studied features are interactive (such as interactive quizzes and other additional functionalities), followed by click-stream features, visual features, and production style.

Specifically, in the controlled experiments category, the main features manipulated are related to interactive features (32% of the studies), principally quizzes and annotations. Other often explored features in this category are enhanced visual features, mainly 360-degree features, production style features (18% of the studies) with a special emphasis on dialogue and monologue, and features that direct the attention of the learner (18% of the studies) (e.g., animations).

Subject domain: Based on our sample, we found that
This is why no specific analysis regarding the STEM domains are prevalent in the targeted subject areas. subject domain is provided. Tendencies and directions: In comparison to previ- ous reviews, the tendencies that could be reaffirmed in 6. Conclusions this study are mainly related to the type of research that was performed, the type of features that have been manip- In this paper, we have presented a survey on 41 papers ulated, and the subject domain in which the educational in the field of video-based learning (2020-2021) to pro- videos have focused. Specifically, the trends discovered vide a structured account of current tendencies in the by Poquet et al. [5] could be confirmed by our review: field. Specifically, we identified (a) common research (1) It was observed that the research community-directed approaches, (b) target applications and the used tech- effort principally towards controlled experiments (54% nologies, (c) investigated video features and developed of the reviewed papers). (2) Text, animation, audio, and a taxonomy which allows structuring the related work, production style are situated among the most explored (d) and, finally, collected information on the covered sub- features. (3) Recall and transfer are the most-used metrics ject domains of the learning environments. to measure learning outcome and effectiveness. (4) Cog- Research approaches: The results show that more nitive load, mental effort, motivation, and the results than half of the studies are dedicated to controlled ex- of eye-tracking are considered among the most popular periments (54% of the studies). A significant proportion metrics for measuring learning experience. 
(5) Target dis- focused on tool development (39% of the studies), while ciplines of the videos were primarily addressing STEM a very small proportion of studies dedicated effort to topics analyze data from learning platforms such as MOOCs Research on video, text, and image analysis has been (2% of the studies) and proposed design principles and dominated by deep learning approaches in recent years. guidelines (5% of the studies). Increasingly, these also find application in the analysis of Tools and technologies: The main applications in educational data, exemplified by their dominance in learn- our sample are three: (1) recommender systems (35% ing outcome prediction, educational video segmentation, of the studies in the “Tools” category), (2) predictors and feature extraction. In most cases, the developed sys- of learning outcome (19%), and (3) video segmentation tems make use of general-purpose pre-trained models, techniques (13%). Technology-wise, keyword-based anal- even though first examples of models specifically trained yses (e.g., using tf-idf to detect pertinent keywords) are on educational data exist, e.g., EduBERT [56], a word the most commonly adopted. Approaches that aim to embedding trained on educational materials. It will be provide video playlists (as opposed to singular video rec- exciting to see the potential of deep learning techniques ommendations) often refer to techniques from the area properly adapted to educational media, and how they of prerequisite detection. Recent approaches to forecast- will enhance educational applications. ing learning success mainly build upon deep learning There are various studies that investigate learning on techniques (e.g., multilayer perceptrons, gated recurrent the web, mainly focusing on features of user behavior units, RNNs, and LSTMs). A similar tendency towards [57] or textual materials. 
Only recent publications such as deep learning approaches can be seen in methods aiming [58] explore the role of videos in such web-based learning processes. However, as shown here, the analysis of video [2] J. Clement, Hours of video uploaded to as a learning resource is a highly active research area. youtube every minute, Statista.com (2019). URL: The related work outlined here can be a guide for future https://www.statista.com/statistics/259477/hours- works which aim to integrate video-based resources into of-video-uploaded-to-youtube-every-minute/. individualized online-learning trajectories, and will pro- [3] M. N. Giannakos, Exploring the video-based learn- vide interested scientists with a starting point for their ing research: A review of the literature, Br. J. Educ. research. Technol. 44 (2013) 191. URL: https://doi.org/10.1111/ The reviewed papers investigated a wide range of video bjet.12070. doi:10.1111/bjet.12070. features, including information from the visual content, [4] A. M. F. Yousef, M. A. Chatti, U. Schroeder, The be it textual or image, and audio information. However, state of video-based learning: A review and future the modalities were mainly considered separately, with- perspectives, Int. J. Adv. Life Sci 6 (2014) 122–135. out regard to their interactions. Research on VBL effi- [5] O. Poquet, L. Lim, N. Mirriahi, S. Dawson, Video ciency could be brought forward by a truly multi-modal and learning: a systematic review (2007-2017), in: analysis of educational videos, which explores how spo- A. Pardo, K. Bartimote-Aufflick, G. Lynch, S. B. ken language, shown text and image, and speaker be- Shum, R. Ferguson, A. Merceron, X. Ochoa (Eds.), havior exhibit a combined message as, for instance, ex- Proceedings of the 8th International Conference on plored by Shi et al. [59]. 
This is in line with psychological Learning Analytics and Knowledge, LAK 2018, Syd- research on instructional design, and might bring new ney, NSW, Australia, March 07-09, 2018, ACM, 2018, insights for educational research in quantifying formal pp. 151–160. URL: https://doi.org/10.1145/3170358. relationships between image, text and speech in their 3170376. doi:10.1145/3170358.3170376. impact on learning success. [6] M. Sablić, A. Mirosavljević, A. Škugor, Video- Limitations and future work: This review, with a based learning (vbl)—past, present and future: An focus on publications from the years 2020 and 2021, only overview of the research published from 2008 to considers a short period time. It is planned to extend the 2019, Technology, Knowledge and Learning (2020) study in the future, to provide a more thorough analysis 1–17. of tendencies and directions in VBL. Moreover, although [7] A. Hansch, L. Hillers, K. McConachie, C. Newman, the keyword set used in the paper retrieval process is T. Schildhauer, J. P. Schmidt, Video and online extensive, we plan to broaden this set to ensure a more learning: Critical reflections and findings from the comprehensive literature review. As pointed out by the field (2015). reviewers of this paper, there is a surprisingly low num- [8] P. J. Guo, J. Kim, R. Rubin, How video production ber of MOOC-related studies included in our dataset. This affects student engagement: an empirical study of will be considered in the extension of our keyword set. MOOC videos, in: M. Sahami, A. Fox, M. A. Hearst, M. T. H. Chi (Eds.), First (2014) ACM Conference on Learning @ Scale, L@S 2014, Atlanta, GA, USA, Acknowledgments March 4-5, 2014, ACM, 2014, pp. 41–50. URL: https: //doi.org/10.1145/2556325.2566239. doi:10.1145/ This work has been partly supported by the Ministry 2556325.2566239. of Science and Education of Lower Saxony, Germany, [9] M. Eradze, A. Dipace, B. Fazlagic, A. D. 
Pietro, Semi- through the Graduate training network “LernMINT: automated student feedback and theory-driven Data-assisted classroom teaching in the MINT subjects”. video-analytics: An exploratory study on educa- Also, this study has been partly supported by the Leib- tional value of videos, in: L. S. Agrati, D. Bur- niz Association, Germany (Leibniz Competition 2018, gos, P. Ducange, P. Limone, L. Perla, P. Picerno, funding line “Collaborative Excellence”, project SALIENT P. Raviolo, C. M. Stracke (Eds.), Bridges and Me- [K68/2017]). Finally, we would like to thank the reviewers diation in Higher Distance Education - Second In- for their valuable feedback. ternational Workshop, HELMeTO 2020, Bari, BA, Italy, September 17-18, 2020, Revised Selected Pa- References pers, volume 1344 of Communications in Computer and Information Science, Springer, 2020, pp. 28–39. [1] A. Smith, S. Toor, P. Van Kessel, Many turn to URL: https://doi.org/10.1007/978-3-030-67435-9_3. youtube for children’s content, news, how-to doi:10.1007/978-3-030-67435-9\_3. lessons, Pew Research Centre 7 (2018). URL: [10] P. Araiza-Alba, T. Keane, B. Matthews, K. Simp- https://www.pewresearch.org/internet/2018/11/ son, G. Strugnell, W. S. Chen, J. Kaufman, The 07/many-turn-to-youtube-for-childrens-content- potential of 360-degree virtual reality videos to news-how-to-lessons/. teach water-safety skills to children, Comput. Educ. 163 (2021) 104096. URL: https://doi.org/10.1016/j. compedu.2020.104096. doi:10.1016/j.compedu. on Human Factors in Computing Systems, Hon- 2020.104096. olulu, HI, USA, April 25-30, 2020, ACM, 2020, pp. [11] J. A. Ghauri, S. Hakimov, R. Ewerth, Classifica- 1–12. URL: https://doi.org/10.1145/3313831.3376845. tion of important segments in educational videos doi:10.1145/3313831.3376845. using multimodal features, in: S. Conrad, I. Tiddi [18] A. Pérez-Navarro, V. Garcia, J. 
Conesa, Students per- (Eds.), Proceedings of the CIKM 2020 Workshops ception of videos in introductory physics courses co-located with 29th ACM International Confer- of engineering in face-to-face and online environ- ence on Information and Knowledge Management ments, Multim. Tools Appl. 80 (2021) 1009–1028. (CIKM 2020), Galway, Ireland, October 19-23, 2020, URL: https://doi.org/10.1007/s11042-020-09665-0. volume 2699 of CEUR Workshop Proceedings, CEUR- doi:10.1007/s11042-020-09665-0. WS.org, 2020. URL: http://ceur-ws.org/Vol-2699/ [19] N. Srivastava, S. Nawaz, J. M. Lodge, E. Velloso, S. M. paper15.pdf. Erfani, J. Bailey, Exploring the usage of thermal [12] A. Das, P. P. Das, Incorporating domain knowl- imaging for understanding video lecture designs edge to improve topic segmentation of long and students’ experiences, in: C. Rensing, H. Drach- MOOC lecture videos, CoRR abs/2012.07589 sler (Eds.), LAK ’20: 10th International Conference (2020). URL: https://arxiv.org/abs/2012.07589. on Learning Analytics and Knowledge, Frankfurt, arXiv:2012.07589. Germany, March 23-27, 2020, ACM, 2020, pp. 250– [13] M. Tavakoli, S. Hakimov, R. Ewerth, G. Kismi- 259. URL: https://doi.org/10.1145/3375462.3375514. hók, A recommender system for open educa- doi:10.1145/3375462.3375514. tional videos based on skill requirements, in: [20] J. C. Muñoz-Carpio, M. A. Cowling, J. R. Birt, Doc- 20th IEEE International Conference on Advanced toral colloquium - exploring the benefits of us- Learning Technologies, ICALT 2020, Tartu, Esto- ing 3600 video immersion to enhance motivation nia, July 6-9, 2020, IEEE, 2020, pp. 1–5. URL: https: and engagement in system modelling education, //doi.org/10.1109/ICALT49669.2020.00008. doi:10. in: D. Economou, A. Klippel, H. Dodds, A. Peña- 1109/ICALT49669.2020.00008. Ríos, M. J. W. Lee, D. Beck, J. Pirker, A. Dengel, [14] A. Abyaa, M. Khalidi Idrissi, S. Bennani, Learner T. M. Peres, J. 
Richter (Eds.), 6th International Con- modelling: systematic review of the literature ference of the Immersive Learning Research Net- from the last 5 years, Educational Technology work, iLRN 2020, San Luis Obispo, CA, USA, June Research and Development 67 (2019) 1105–1143. 21-25, 2020, IEEE, 2020, pp. 403–406. URL: https: doi:10.1007/s11423-018-09644-1. //doi.org/10.23919/iLRN47897.2020.9155100. doi:10. [15] A. Nugraha, I. A. Wahono, J. Zhanghe, T. Harada, 23919/iLRN47897.2020.9155100. T. Inoue, Creating dialogue between a tutee agent [21] W. Daher, H. Sleem, Middle school students’ learn- and a tutor in a lecture video improves students’ ing of social studies in the video and 360-degree attention, in: A. Nolte, C. Alvarez, R. Hishiyama, videos contexts, IEEE Access 9 (2021) 78774–78783. I. Chounta, M. J. Rodríguez-Triana, T. Inoue (Eds.), URL: https://doi.org/10.1109/ACCESS.2021.3083924. Collaboration Technologies and Social Computing doi:10.1109/ACCESS.2021.3083924. - 26th International Conference, CollabTech 2020, [22] H. Huang, G. Hwang, C. Chang, Learning to be Tartu, Estonia, September 8-11, 2020, Proceedings, a writer: A spherical video-based virtual reality volume 12324 of Lecture Notes in Computer Science, approach to supporting descriptive article writing Springer, 2020, pp. 96–111. URL: https://doi.org/10. in high school chinese courses, Br. J. Educ. Technol. 1007/978-3-030-58157-2_7. doi:10.1007/978-3- 51 (2020) 1386–1405. URL: https://doi.org/10.1111/ 030-58157-2\_7. bjet.12893. doi:10.1111/bjet.12893. [16] M. Beege, M. Ninaus, S. Schneider, S. Nebel, [23] C.-H. Chen, Ar videos as scaffolding to foster stu- J. Schlemmel, J. Weidenmüller, K. Moeller, G. D. Rey, dents’ learning achievements and motivation in efl Investigating the effects of beat and deictic gestures learning, British Journal of Educational Technology of a lecturer in educational videos, Comput. Educ. 51 (2020) 657–672. 156 (2020) 103955. URL: https://doi.org/10.1016/j. [24] X. Wang, L. 
Lin, M. Han, J. M. Spector, Im- compedu.2020.103955. doi:10.1016/j.compedu. pacts of cues on learning: Using eye-tracking tech- 2020.103955. nologies to examine the functions and designs of [17] B. Lee, K. Muldner, Instructional video design: In- added cues in short instructional videos, Com- vestigating the impact of monologue- and dialogue- put. Hum. Behav. 107 (2020) 106279. URL: https: style presentations, in: R. Bernhaupt, F. F. Mueller, //doi.org/10.1016/j.chb.2020.106279. doi:10.1016/ D. Verweij, J. Andres, J. McGrenere, A. Cockburn, j.chb.2020.106279. I. Avellino, A. Goguey, P. Bjøn, S. Zhao, B. P. Sam- [25] J. Moon, J. Ryu, The effects of social and cognitive son, R. Kocielnik (Eds.), CHI ’20: CHI Conference cues on learning comprehension, eye-gaze pattern, 158 (2020) 104000. URL: https://doi.org/10.1016/j. and cognitive load in video instruction, J. Comput. compedu.2020.104000. doi:10.1016/j.compedu. High. Educ. 33 (2021) 39–63. URL: https://doi.org/10. 2020.104000. 1007/s12528-020-09255-x. doi:10.1007/s12528- [34] N. Garrett, Segmentation’s failure to improve soft- 020-09255-x. ware video tutorials, Br. J. Educ. Technol. 52 (2021) [26] A. Cookson, D. Kim, T. Hartsell, Enhancing stu- 318–336. URL: https://doi.org/10.1111/bjet.13000. dent achievement, engagement, and satisfaction doi:10.1111/bjet.13000. using animated instructional videos, Int. J. Inf. [35] D. Lang, G. Chen, K. Mirzaei, A. Paepcke, Is faster Commun. Technol. Educ. 16 (2020) 113–125. URL: better?: a study of video playback speed, in: C. Rens- https://doi.org/10.4018/IJICTE.2020070108. doi:10. ing, H. Drachsler (Eds.), LAK ’20: 10th International 4018/IJICTE.2020070108. Conference on Learning Analytics and Knowledge, [27] M. A. Al-Khateeb, A. M. Alduwairi, Effect of Frankfurt, Germany, March 23-27, 2020, ACM, 2020, teaching geometry by slow-motion videos on the pp. 260–269. URL: https://doi.org/10.1145/3375462. 8th graders’ achievement, Int. J. Interact. Mob. 3375466. 
doi:10.1145/3375462.3375466. Technol. 14 (2020) 57–67. URL: https://www.online- [36] A. A. Mubarak, H. Cao, S. A. M. Ahmed, Predictive journals.org/index.php/i-jim/article/view/12985. learning analytics using deep learning model in [28] M. C. Sözeri, S. B. Kert, Ineffectiveness of online in- moocs’ courses videos, Educ. Inf. Technol. 26 (2021) teractive video content developed for programming 371–392. URL: https://doi.org/10.1007/s10639-020- education, Int. J. Comput. Sci. Educ. Sch. 4 (2021) 10273-6. doi:10.1007/s10639-020-10273-6. 49–69. URL: https://doi.org/10.21585/ijcses.v4i3.99. [37] B. Jeon, N. Park, Dropout prediction over weeks doi:10.21585/ijcses.v4i3.99. in moocs by learning representations of clicks and [29] A. A. Kuhail, M. S. Aqel, Interactive digital videos videos, CoRR abs/2002.01955 (2020). URL: https: and their impact on sixth graders’ english read- //arxiv.org/abs/2002.01955. arXiv:2002.01955. ing and vocabulary skills and retention, Int. J. [38] H. E. Aouifi, Y. Es-Saady, M. E. Hajji, M. Mimis, Inf. Commun. Technol. Educ. 16 (2020) 42–56. URL: H. Douzi, Toward student classification in educa- https://doi.org/10.4018/IJICTE.2020070104. doi:10. tional video courses using knowledge tracing, in: 4018/IJICTE.2020070104. M. Fakir, M. Baslam, R. E. Ayachi (Eds.), Business [30] D. Leisner, C. G. Zahn, A. Ruf, A. A. P. Cattaneo, Dif- Intelligence - 6th International Conference, CBI ferent ways of interacting with videos during learn- 2021, Beni Mellal, Morocco, May 27-29, 2021, Pro- ing in secondary physics lessons, in: C. Stephanidis, ceedings, volume 416 of Lecture Notes in Business M. Antona (Eds.), HCI International 2020 - Posters Information Processing, Springer, 2021, pp. 73–82. - 22nd International Conference, HCII 2020, Copen- URL: https://doi.org/10.1007/978-3-030-76508-8_6. hagen, Denmark, July 19-24, 2020, Proceedings, Part doi:10.1007/978-3-030-76508-8\_6. II, volume 1225 of Communications in Computer [39] C. Schulten, S. Manske, A. 
Langner-Thiele, H. U. and Information Science, Springer, 2020, pp. 284– Hoppe, Digital value-adding chains in vocational 291. URL: https://doi.org/10.1007/978-3-030-50729- education: Automatic keyword extraction from 9_40. doi:10.1007/978-3-030-50729-9\_40. learning videos to provide learning resource recom- [31] S. Chen, D. Wang, Y. Huang, Exploring the comple- mendations, in: C. Alario-Hoyos, M. J. Rodríguez- mentary features of audio and text notes for video- Triana, M. Scheffel, I. A. Sánchez, S. Dennerlein based learning in mobile settings, in: Extended (Eds.), Addressing Global Challenges and Quality Abstracts of the 2021 CHI Conference on Human Education - 15th European Conference on Tech- Factors in Computing Systems, 2021, pp. 1–7. nology Enhanced Learning, EC-TEL 2020, Heidel- [32] X. Lu, Q. Li, X. Wang, Research on the impacts berg, Germany, September 14-18, 2020, Proceedings, of feedback in instructional videos on college stu- volume 12315 of Lecture Notes in Computer Science, dents’ attention and learning effects, in: W. Shen, Springer, 2020, pp. 15–29. URL: https://doi.org/10. J. A. Barthès, J. Luo, Y. Shi, J. Zhang (Eds.), 24th 1007/978-3-030-57717-9_2. doi:10.1007/978-3- IEEE International Conference on Computer Sup- 030-57717-9\_2. ported Cooperative Work in Design, CSCWD 2021, [40] J. Jordán, S. Valero, C. Turró, V. J. Botti, Recom- Dalian, China, May 5-7, 2021, IEEE, 2021, pp. 513– mending learning videos for moocs and flipped 516. URL: https://doi.org/10.1109/CSCWD49262. classrooms, in: Y. Demazeau, T. Holvoet, J. M. 2021.9437774. doi:10.1109/CSCWD49262.2021. Corchado, S. Costantini (Eds.), Advances in Practi- 9437774. cal Applications of Agents, Multi-Agent Systems, [33] D. C. D. van Alten, C. Phielix, J. Janssen, L. Kester, and Trustworthiness. The PAAMS Collection - 18th Self-regulated learning support in flipped learning International Conference, PAAMS 2020, L’Aquila, videos enhances learning outcomes, Comput. Educ. 
Italy, October 7-9, 2020, Proceedings, volume 12092 of Lecture Notes in Computer Science, Springer, cational videos hierarchical indexing with ebooks, 2020, pp. 146–157. URL: https://doi.org/10.1007/ in: H. Mitsuhara, Y. Goda, Y. Ohashi, M. M. T. Ro- 978-3-030-49778-1_12. doi:10.1007/978-3-030- drigo, J. Shen, N. Venkatarayalu, G. Wong, M. Ya- 49778-1\_12. mada, C. Lei (Eds.), IEEE International Conference [41] M. C. Aytekin, S. Räbiger, Y. Saygin, Discov- on Teaching, Assessment, and Learning for Engi- ering the prerequisite relationships among in- neering, TALE 2020, Takamatsu, Japan, December structional videos from subtitles, in: A. N. 8-11, 2020, IEEE, 2020, pp. 482–489. URL: https: Rafferty, J. Whitehill, C. Romero, V. Cavalli- //doi.org/10.1109/TALE48869.2020.9368461. doi:10. Sforza (Eds.), Proceedings of the 13th Interna- 1109/TALE48869.2020.9368461. tional Conference on Educational Data Mining, [47] S. Lallé, C. Conati, A data-driven student model EDM 2020, Fully virtual conference, July 10-13, to provide adaptive support during video watching 2020, International Educational Data Mining Soci- across moocs, in: I. I. Bittencourt, M. Cukurova, ety, 2020. URL: https://educationaldatamining.org/ K. Muldner, R. Luckin, E. Millán (Eds.), Artificial files/conferences/EDM2020/papers/paper_99.pdf. Intelligence in Education - 21st International Con- [42] C. Tang, J. Liao, H. Wang, C. Sung, W. Lin, Con- ference, AIED 2020, Ifrane, Morocco, July 6-10, 2020, ceptguide: Supporting online video learning with Proceedings, Part I, volume 12163 of Lecture Notes concept map-based recommendation of learning in Computer Science, Springer, 2020, pp. 282–295. path, in: J. Leskovec, M. Grobelnik, M. Najork, URL: https://doi.org/10.1007/978-3-030-52237-7_23. J. Tang, L. Zia (Eds.), WWW ’21: The Web Con- doi:10.1007/978-3-030-52237-7\_23. ference 2021, Virtual Event / Ljubljana, Slovenia, [48] Y. Chen, C. Wang, Z. 
Jian, Research on evaluation April 19-23, 2021, ACM / IW3C2, 2021, pp. 2757– algorithm of teacher’s teaching enthusiasm based 2768. URL: https://doi.org/10.1145/3442381.3449808. on video, in: ICRAI 2020: 6th International Confer- doi:10.1145/3442381.3449808. ence on Robotics and Artificial Intelligence, Singa- [43] D. I. Bleoanca, S. Heras, J. Palanca, V. Julián, M. C. pore, November 20-22, 2020, ACM, 2020, pp. 184– Mihaescu, LSI based mechanism for educational 191. URL: https://doi.org/10.1145/3449301.3449333. videos retrieval by transcripts processing, in: doi:10.1145/3449301.3449333. C. Analide, P. Novais, D. Camacho, H. Yin (Eds.), [49] T. Weinert, M. T. de Gafenco, M. S. Billert, Intelligent Data Engineering and Automated Learn- N. Boerner, Fostering interaction in higher ing - IDEAL 2020 - 21st International Conference, education with deliberate design of interactive Guimaraes, Portugal, November 4-6, 2020, Pro- learning videos, in: J. F. George, S. Paul, ceedings, Part I, volume 12489 of Lecture Notes R. De’, E. Karahanna, S. Sarker, G. Oestreicher- in Computer Science, Springer, 2020, pp. 88–100. Singer (Eds.), Proceedings of the 41st Interna- URL: https://doi.org/10.1007/978-3-030-62362-3_9. tional Conference on Information Systems, ICIS doi:10.1007/978-3-030-62362-3\_9. 2020, Making Digital Inclusive: Blending the Lo- [44] X. Wang, W. Huang, Q. Liu, Y. Yin, Z. Huang, cak and the Global, Hyderabad, India, December L. Wu, J. Ma, X. Wang, Fine-grained similarity 13-16, 2020, Association for Information Systems, measurement between educational videos and ex- 2020. URL: https://aisel.aisnet.org/icis2020/digital_ ercises, in: C. W. Chen, R. Cucchiara, X. Hua, learning_env/digital_learning_env/12. G. Qi, E. Ricci, Z. Zhang, R. Zimmermann (Eds.), [50] J. Ge, X. 
Li, Design strategies of EFL learning MM ’20: The 28th ACM International Confer- videos: Exampled by a china MOOC, in: ICEIT 2020, ence on Multimedia, Virtual Event / Seattle, WA, Proceedings of the 9th International Conference USA, October 12-16, 2020, ACM, 2020, pp. 331– on Educational and Information Technology, Ox- 339. URL: https://doi.org/10.1145/3394171.3413783. ford, UK, February11-13, 2020, ACM, 2020, pp. 68– doi:10.1145/3394171.3413783. 71. URL: https://doi.org/10.1145/3383923.3383927. [45] A. Hartholt, A. Reilly, E. Fast, S. Mozgai, Introduc- doi:10.1145/3383923.3383927. ing canvas: Combining nonverbal behavior gener- [51] R. E. Mayer, C. Pilegard, Principles for managing ation with user-generated content to rapidly cre- essential processing in multimedia learning: Seg- ate educational videos, in: S. Marsella, R. Jack, menting, pretraining, and modality principles, The H. H. Vilhjálmsson, P. Sequeira, E. S. Cross (Eds.), Cambridge handbook of multimedia learning (2005) IVA ’20: ACM International Conference on In- 169–182. telligent Virtual Agents, Virtual Event, Scotland, [52] R. Ramachandran, E. M. Sparck, M. Levis-Fitzgerald, UK, October 20-22, 2020, ACM, 2020, pp. 25:1– Investigating the effectiveness of using application- 25:3. URL: https://doi.org/10.1145/3383652.3423880. based science education videos in a general chem- doi:10.1145/3383652.3423880. istry lecture course, J. Chem. Educ. 96 (2019) [46] S. Horovitz, Y. Ohayon, Boocture: Automatic edu- 479–485. URL: https://doi.org/10.1021/acs.jchemed. 8b00777. doi:10.1021/acs.jchemed.8b00777. [53] G. Salomon, Television is "easy" and print is "tough": The differential investment of mental effort in learning as a function of perceptions and attributions., Journal of Educational Psychol- ogy 76 (1984) 647–658. doi:https://doi.org/10. 1037/0022-0663.76.4.647. [54] K. 
Chorianopoulos, A taxonomy of asynchronous instructional video styles, The International Review of Research in Open and Distributed Learning 19 (2018) 294–311. [55] R. Mayer, R. E. Mayer, The Cambridge handbook of multimedia learning, Cambridge university press, 2005. [56] B. Clavié, K. Gal, Edubert: Pretrained deep language models for learning analytics, CoRR abs/1912.00690 (2019). URL: http://arxiv.org/abs/ 1912.00690. arXiv:1912.00690. [57] U. Gadiraju, R. Yu, S. Dietze, P. Holtz, Analyzing knowledge gain of users in informational search sessions on the web, in: C. Shah, N. J. Belkin, K. Byström, J. Huang, F. Scholer (Eds.), Proceed- ings of the 2018 Conference on Human Informa- tion Interaction and Retrieval, CHIIR 2018, New Brunswick, NJ, USA, March 11-15, 2018, ACM, 2018, pp. 2–11. URL: https://doi.org/10.1145/3176349. 3176381. doi:10.1145/3176349.3176381. [58] C. Otto, R. Yu, G. Pardi, J. von Hoyer, M. Rokicki, A. Hoppe, P. Holtz, Y. Kammerer, S. Dietze, R. Ew- erth, Predicting knowledge gain during web search based on multimedia resource consumption, in: I. Roll, D. S. McNamara, S. A. Sosnovsky, R. Luckin, V. Dimitrova (Eds.), Artificial Intelligence in Educa- tion - 22nd International Conference, AIED 2021, Utrecht, The Netherlands, June 14-18, 2021, Pro- ceedings, Part I, volume 12748 of Lecture Notes in Computer Science, Springer, 2021, pp. 318–330. URL: https://doi.org/10.1007/978-3-030-78292-4_26. doi:10.1007/978-3-030-78292-4\_26. [59] J. Shi, C. Otto, A. Hoppe, P. Holtz, R. Ewerth, In- vestigating correlations of automatically extracted multimodal features and lecture video quality, in: Proceedings of the 1st International Workshop on Search as Learning with Multimedia Information, SALMM ’19, Association for Computing Machin- ery, New York, NY, USA, 2019, p. 11–19. URL: https: //doi.org/10.1145/3347451.3356731. doi:10.1145/ 3347451.3356731.