=Paper= {{Paper |id=Vol-1518/paper9 |storemode=property |title=Discovering Learning Antecedents in Learning Analytics Literature |pdfUrl=https://ceur-ws.org/Vol-1518/paper9.pdf |volume=Vol-1518 |dblpUrl=https://dblp.org/rec/conf/lak/KobayashiMK15 }} ==Discovering Learning Antecedents in Learning Analytics Literature== https://ceur-ws.org/Vol-1518/paper9.pdf
  Discovering Learning Antecedents in Learning Analytics
                        Literature
         Vladimer Kobayashi                                     Stefan Mol                              Gábor Kismihók
         CJKR, HRM-OB, ABS                                CJKR, HRM-OB, ABS                           CJKR, HRM-OB, ABS
        University of Amsterdam                          University of Amsterdam                     University of Amsterdam
              Netherlands                                      Netherlands                                 Netherlands
        V.Kobayashi@uva.nl                                  S.T.Mol@uva.nl                           G.Kismihok@uva.nl



ABSTRACT
We investigated various learning antecedents that have been the research subjects of Learning Analytics (LA) studies and explored the content and quantity of the LA literature with respect to each antecedent through text mining the LAK dataset. Our goal was to simultaneously reveal to what extent LA researchers address learning antecedents and how they have incorporated these in the implementation of LA solutions (e.g. models and software technologies) to facilitate and augment student learning. Instead of taking a pure text mining approach, we undertook a slightly different strategy by (i) identifying antecedents of student learning by examining extant literature on learning and educational theories and (ii) identifying which among the theoretically relevant antecedents are currently reported in LA studies. The analytical techniques we employed were a mix of domain-based analysis and corpus analytics, which included association analysis and key-phrase extraction. The results showed that most LA studies are geared toward capturing and measuring student awareness and promoting social learning, and less toward goal-setting and self-efficacy. Through this work we hope to encourage the LA community to dedicate research efforts to also investigate other relatively neglected yet promising learning antecedents.

Keywords
student learning, corpus analytics, learning analytics

1. MOTIVATION AND OBJECTIVE
The Learning Analytics (LA) field uses analytics to understand and facilitate student learning. Since learning is influenced by various antecedents and circumstances, some LA researchers focus on capturing, measuring, and enhancing these antecedents in an effort to impact student learning. This is especially relevant nowadays with the proliferation of nontraditional venues for learning such as online learning. Examples of these antecedents include awareness, social learning, and self-regulated learning, to name but a few.

As LA studies flourish, a need arises to address the question of how LA as a field has contributed so far to our understanding and to the enhancement of student learning. This can be answered in part by characterizing LA studies according to which learning antecedents they tackle. This could help researchers from various education-related disciplines to keep track of, compare, and share knowledge and to identify opportunities for further research. It could also provide a basis for the adaptation of LA projects and for explicating how LA models and software technologies influence learning. How each element of an LA project imparts information or generates and uses data that evaluate the determinants of student learning success is a major concern.

Our primary objective was to explore the content and quantity of LA literature that reports each learning antecedent. In parallel, we shifted the focus towards the antecedents themselves by finding which antecedents are often addressed and which are not. This approach would facilitate a more objective assessment and comparison of whether LA studies have achieved their intended outcomes.

For this study we used the dataset provided by the LAK dataset challenge [9] and other literature on student learning theories to accomplish our objective.

2. METHODOLOGY
As an overview, we used a text mining approach to discover learning antecedents. Although text mining is naturally an inductive approach, we supplemented our investigation with domain information. The steps we undertook are illustrated in Figure 1.

Figure 1: Diagrammatic view of the methodological steps followed in this study.

2.1 Domain-based Analysis
We first performed an inquiry regarding the antecedents that influence student learning and learning outcomes. From this inquiry we identified keywords that are usually strongly associated with each antecedent. The keywords represent the vocabulary used to refer to the antecedents and were extracted from existing literature on education and student learning theories. The antecedents are discussed in Section 3.

The list of keywords was further expanded by using a lexical database called WordNet (http://wordnet.princeton.edu/) to find semantically similar words. This is a vital step because authors use varying terms to convey the same
concept. An example would be to use “participate” rather than “engage”. The expanded keyword list was used in the succeeding steps.

2.2 Corpus Analytics on LAK dataset
Corpus analytics was performed in the following manner.

First, we kept matters simple yet meaningful by choosing to perform corpus analytics only on the abstract of each publication. This has a downside, such as missing otherwise important information, but in exchange it kept the analysis manageable. Moreover, this decision is sufficient for our purpose since the abstract contains the gist of the whole article and provides a summary of the paper’s objectives, methodology, and conclusion.

Second, we created a corpus containing the abstracts of all papers in the LAK dataset. Each document was pre-processed by removing punctuation, removing numbers, transforming upper case letters to lower case, removing stopwords, and selectively stemming specific words. An example of the selective stemming was to treat the words “engaging” and “engagement” as derivatives of the word “engage”. The method of stemming that we applied here is the look-up table method, where the look-up table is the expanded keyword list from the domain analysis.

Third, a further filtering was implemented to reduce the number of terms. The filtering was done using the expanded keyword list in conjunction with association analysis so that potentially important words not present in the list could be identified and added.

Fourth and finally, the pre-processing stage culminated in the creation of the document-by-term matrix weighted by raw term frequencies. We were interested in determining which among the theory-inspired antecedents (see Section 3) are discussed in each LA study. The document-by-term matrix acted as a springboard from which we explored the construction of other matrices (e.g. co-occurrence matrices) and the application of other analytical techniques such as key-phrase extraction.

All analyses were done using the R software (http://www.r-project.org/) and the packages tm (http://cran.r-project.org/web/packages/tm/index.html), wordnet (http://cran.r-project.org/web/packages/wordnet/index.html), and igraph (http://cran.r-project.org/web/packages/igraph/index.html).

2.3 Two assumptions
We assumed that the mention of keywords associated with a learning antecedent in the abstract of a paper indicates that the paper deals with that learning antecedent. We anticipate a number of caveats with this assumption. One possible scenario is that the keyword is used in a different sense. An example is the keyword “goal”: in some papers the presence of this word does not mean that they deal with Goal-setting; the word may instead refer to the goal of the study. Thus it is also important to consider the context in which the word is used. We addressed this by examining other words in the abstract. Using association analysis we noticed that when the word “goal” is used in the sense of Goal-setting, words such as performance, achievement, or learning are also encountered.

Another assumption is that the mention of keywords belonging to different learning antecedents in one abstract means that these learning antecedents are simultaneously addressed, and with the same emphasis, in that paper. We can see a problem with this since some papers just use the concept but do not develop it further. This problem can be addressed by using the information on the raw frequencies of the term. The higher the raw frequency, the more importance we can attach to the term with respect to a particular paper.

3. NINE ANTECEDENTS OF STUDENT LEARNING
The keywords represent 9 common antecedents that have been reported by educational experts as antecedents for success in learning. The antecedents are: (1) Engagement, (2) Motivation, (3) Self-reflection (including self-assessment and self-regulation), (4) Social Learning (among students and between students and teachers), (5) Assessment (e.g. formative testing and evaluation), (6) Recommendation (and feedback), (7) Goal-setting, (8) Awareness (social awareness, context awareness), and (9) Self-confidence. These were selected based on our previous content analysis of publications in the area of education and student learning.

Student engagement refers to the quality of effort and level of involvement that students invest in their learning. It has been shown to be positively linked to gains in general abilities, critical thinking, and grades [1]. Therefore it has worthwhile effects on student learning and success in education.

Motivation is a drive, a stimulus, an incentive or desire that causes someone to act or to expend effort to accomplish something [8]. Often, it is manifested when students are attentive, participative and active in class.

Self-reflection occurs when learners evaluate the breadth and scope of their knowledge. It is important in learning because it helps students to identify what they need to learn, leading to effective self-regulation [5].

Some researchers view learning as a collaborative process where learners interact and share knowledge. The roles, activities, and behavior that students assume in a social learning context ultimately impact their learning [2].

Testing and assessment in general have long been used to assess whether students have achieved specific learning outcomes. Furthermore, during testing information is stored in the brain for long-term retrieval, which in turn is essential for learning transfer (i.e. using information in different contexts) and meaning generation.

Recommendation is seen as a potential antecedent of learning since it helps students track their learning achievement and improve their learning at the same time [3].

Goals direct attention, energize effort and promote persistence. Studies have shown the valuable effect of goal-setting on academic achievement, self-regulation, and deep learning strategies [6].

Awareness provides context for learning since it discloses information about other persons’ activities and the environment where learning takes place. It has been shown to be crucial to learning and contributes to the quality of active participation [7].

Last is self-efficacy (colloquially termed self-confidence), which is usually defined as belief in one’s own capability to accomplish tasks and achieve goals [4]. It is important in learning since students
must believe in their own capacity to learn even if the material is difficult.

We added Analytics as a further category to see which LA projects have incorporated advanced analytical tools on top of the basic summarization and visualization features.

4. MAIN FINDINGS AND DISCUSSION
Combining the keywords obtained from the domain analysis, association analysis, and corpus analytics, we obtained the keyword list in Table 1, in which the keywords are grouped according to the antecedents with which they are most likely associated.

Table 1: Keywords associated with each learning antecedent.

 Learning Antecedents   Keywords
 Engagement             engage, participate, active, access, resource
 Motivation             motivate, encourage
 Self-reflection        negotiate, self-regulate, self-reflect, self-aware,
                        self-discipline, self-test, reflect, self-report,
                        self-knowledge
 Social Learning        collaborate, network, interact, social, community,
                        graph, connect
 Assessment             test, assess
 Recommendation         recommend, feedback, intervene
 Goal-setting           goal, sub-goal
 Awareness              aware, content-aware, track, monitor, compare
 Self-Confidence        confidence, self-efficacy
 Analytics              model, student model, user model, analytics,
                        analytic, predict, valid, visual, classify

From the document-by-term matrix we identified which of the documents have used analytics and which learning antecedents are addressed in each document. We also constructed 4 co-occurrence matrices (see Figure 2) that reveal which learning antecedents are often treated simultaneously and which keywords are often mentioned together. A sampling of the output is presented in Figure 3.

Figure 2: Four co-occurrence matrices constructed from the term-by-document matrix.

The first subfigure (Figure 3a) shows a bar plot that depicts the number of papers in the LAK dataset that have dealt with each learning antecedent. It can be clearly seen that the focus of many studies is on the learning antecedents awareness, social learning, engagement, and assessment. This can be explained by the considerable interest of LA researchers in online learning settings, where the capture, measurement, and monitoring of these antecedents are both challenging and crucial. On the other hand, the less often discussed antecedents are goal-setting, motivation, and self-discipline. Although goal-setting has a slightly higher bar than self-reflection, this is because some studies that mention the word “goal” actually referred to the aim or objective of the studies.

Figure 3b depicts both the number of studies that deal with each antecedent and the relationship (in the sense of co-occurrence) among the antecedents. The red circles are the antecedents and the green ones are the keywords. An edge connects a keyword to its associated antecedent, and edges between antecedents represent co-occurrence relationships. We include “Analytics” to see which of the antecedents make heavy use of analytics and what type of analytics is commonly employed. It is not difficult to observe that social learning and awareness are the most related in terms of the number of publications that tackled them. They are followed by awareness and assessment, although there is a strong indication that assessment here may refer to the students’ assessment of their knowledge, context, peers, and environment and not to tests or evaluation.

The last subfigure (Figure 3c) visually represents the relationship among words as well as the number of studies that mention each word (as expressed by the size of the circle). It is not surprising to observe that the word “model” is the leading keyword; this is because most LA researchers are concerned with creating models to describe some learning-related phenomena, as is to be expected from LA research. Another observation worth mentioning is the conspicuousness of the three vertices that represent visual, network, and interact and the interconnections between them. These three are indicative of the social learning antecedent, since interactions among students are usually visualized by means of a network structure.

In Table 2, we see the list of words that are highly associated with the keywords of each antecedent. We discovered these with the use of association analysis and key-phrase extraction. The list is incomplete since we present only the ones that were interesting in our opinion. These words could be used to further enrich our original keyword list. Moreover, we unearthed interesting relationships such as the association between “affect” and “engagement”, “assessment” and “scores”, and “recommendation” and “similarity”. Some of these associations reveal the kind of techniques used to analyze particular antecedents (e.g. the use of the idea of similarity in recommendation) and the underlying concepts that might govern an antecedent (e.g. the affective state of a student might indicate or influence engagement).

Table 2: Other terms associated with each antecedent.

 Engagement        affect, peripheral, discussion, home
 Motivation        learnograms
 Self-reflection   cope, personal, health, feelings
 Social learning   blackboard, intergroup, intranetwork, cyberlearner
 Assessment        scores
 Recommendation    similarity
 Goal setting      orientation, temporal
 Awareness         clues, cope
 Self-confidence   egocentric, high achieving
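The pre-processing steps of Section 2.2 and the kind of association analysis behind Table 2 can be illustrated with a small Python sketch. This is not the authors' code (the study used R with the tm package). The abstracts, the stopword list, and the look-up table below are invented for illustration, and reading "association analysis" as Pearson correlation between term count vectors is an assumption.

```python
import math
import re
from collections import Counter

# Hypothetical look-up table for selective stemming (surface form -> keyword).
STEM_LOOKUP = {"engaging": "engage", "engagement": "engage", "scores": "score"}
# Tiny illustrative stopword list; a real run would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "we", "for", "on", "with", "is"}


def preprocess(text):
    """Lower-case, drop punctuation and numbers, remove stopwords, stem by look-up."""
    text = re.sub(r"[^a-z\s-]", " ", text.lower())
    tokens = [t.strip("-") for t in text.split()]
    return [STEM_LOOKUP.get(t, t) for t in tokens if t and t not in STOPWORDS]


def doc_term_matrix(docs):
    """Return (vocabulary, rows); each row holds raw term frequencies for one document."""
    counts = [Counter(preprocess(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[t] for t in vocab] for c in counts]


def pearson(x, y):
    """Pearson correlation of two equally long count vectors (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def associated_terms(vocab, rows, keyword, min_corr=0.5):
    """Terms whose counts correlate with `keyword` across documents (assumed measure)."""
    cols = {t: [r[i] for r in rows] for i, t in enumerate(vocab)}
    target = cols[keyword]
    scored = [(t, round(pearson(target, col), 2)) for t, col in cols.items() if t != keyword]
    return sorted([tc for tc in scored if tc[1] >= min_corr], key=lambda tc: (-tc[1], tc[0]))


# Invented toy abstracts, not taken from the LAK dataset.
abstracts = [
    "Affect detectors predict student engagement.",
    "Engaging tasks and positive affect in 2 courses.",
    "Test scores for assessment.",
    "Network analysis of forum posts.",
]
vocab, rows = doc_term_matrix(abstracts)
print(associated_terms(vocab, rows, "engage", min_corr=0.9))
```

On this toy corpus only "affect" co-varies perfectly with "engage", which mirrors the affect and engagement association reported in Table 2. The threshold `min_corr` is a free parameter of the sketch, not a value reported in the paper.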
5. CONCLUSION AND FUTURE WORK
In this study we show how an analysis that combines domain-based information and corpus analytics could be used to uncover and analyse interesting concepts in LA literature. These concepts directly deal with the question of how LA has been used to improve our understanding and control of a number of learning antecedents. We believe that to fully answer that question a more detailed analysis should be undertaken, such as investigating the measures and validity of the constructed models as described in the publications. Nevertheless, our approach clears the way for such detailed analysis. Our study also highlights the need to study other antecedents that might be critical to student learning but do not yet receive due research attention. From an educator’s perspective it is now becoming clearer how LA solutions impact learning and on which aspect the contribution is focused. It is now time that we move LA from a technique-laden endeavor to a more theory-driven approach.

If this work is selected, we will also present our work on the temporal analysis of these antecedents, such as visualizing the evolution of the focus of LA studies on each concept. Moreover, we aim to analyze how publications in educational data mining, learning analytics and technology-enhanced learning differ in this respect.

6. ACKNOWLEDGEMENT
We gratefully acknowledge the publishers who have contributed to the LAK Dataset: ACM, International Educational Data Mining Society and Journal on Education Technology & Society. We are grateful for the financial support of the Eduworks Marie Curie Initial Training Network Project (PITN-GA-2013-608311) of the European Commission’s 7th Framework Programme.

7. REFERENCES
[1] Aguiar, E. et al. 2014. Engagement vs performance: using electronic portfolios to predict first semester engineering student retention. (2014), 103–112.
[2] Barr, J. and Gunawardena, A. 2012. Classroom Salon: A Tool for Social Collaboration. Proceedings of the 43rd ACM Technical Symposium on Computer Science Education (New York, NY, USA, 2012), 197–202.
[3] Bramucci, R. and Gaston, J. 2012. Sherpa: Increasing Student Success with a Recommendation Engine. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (New York, NY, USA, 2012), 82–83.
[4] Diseth, Å. 2011. Self-efficacy, goal orientations and learning strategies as mediators between preceding and subsequent academic achievement. Learning and Individual Differences. 21, 2 (Apr. 2011), 191–195.
[5] Govaerts, S. et al. 2012. The Student Activity Meter for Awareness and Self-reflection. CHI ’12 Extended Abstracts on Human Factors in Computing Systems (New York, NY, USA, 2012), 869–884.
[6] Latham, G.P. and Locke, E.A. 2007. New Developments in and Directions for Goal-Setting Research. European Psychologist. 12, 4 (Jan. 2007), 290–300.
[7] Pohl, A. et al. 2012. Sensing the classroom: Improving awareness and self-awareness of students in Backstage. 2012 15th International Conference on Interactive Collaborative Learning (ICL) (Sep. 2012), 1–8.
[8] Schiefele, U. Interest, Learning, and Motivation.
[9] Taibi, D. and Dietze, S. 2013. Fostering analytics on learning analytics research: the LAK dataset. In: CEUR WS Proceedings Vol 974, Proceedings of the LAK Data Challenge, held at LAK2013 - 3rd International Conference on Learning Analytics and Knowledge (Leuven, BE, Apr. 2013).




Figure 3: Sampling of the output from the analysis (panels (a), (b), and (c)).
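As a rough illustration of the data behind a network such as Figure 3b, the following Python sketch derives node sizes and antecedent co-occurrence weights from per-abstract keyword matches. It is a sketch under stated assumptions, not the authors' igraph code: the keyword-to-antecedent mapping is an excerpt modelled on Table 1, the input sets are invented, and the function name `build_graph` is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Excerpt modelled on Table 1 (keyword -> antecedent); illustrative only.
KEYWORD_TO_ANTECEDENT = {
    "engage": "Engagement",
    "participate": "Engagement",
    "collaborate": "Social Learning",
    "network": "Social Learning",
    "aware": "Awareness",
    "monitor": "Awareness",
}


def build_graph(doc_keywords):
    """Build graph data from one set of matched keywords per abstract.

    Returns (node_size, edge_weight, membership):
    node_size   - number of abstracts that mention each keyword or antecedent
    edge_weight - number of abstracts in which two antecedents co-occur
    membership  - fixed keyword-to-antecedent edges
    """
    node_size = Counter()
    edge_weight = Counter()
    for keywords in doc_keywords:
        antecedents = {KEYWORD_TO_ANTECEDENT[k] for k in keywords}
        for node in keywords | antecedents:
            node_size[node] += 1
        # Sort so that each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(antecedents), 2):
            edge_weight[(a, b)] += 1
    membership = sorted(KEYWORD_TO_ANTECEDENT.items())
    return node_size, edge_weight, membership


# Invented keyword matches for three abstracts.
doc_keywords = [
    {"engage", "monitor"},
    {"aware", "collaborate", "engage"},
    {"aware", "monitor"},
]
node_size, edge_weight, membership = build_graph(doc_keywords)
for pair, weight in edge_weight.most_common():
    print(pair, weight)
```

Counting each abstract at most once per antecedent treats every mention with equal emphasis, which is the second assumption discussed in Section 2.3. Weighting by raw term frequency instead would be a straightforward variation.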