=Paper= {{Paper |id=Vol-1518/paper1 |storemode=property |title=Visualizing Uncertainty in the Prediction of Academic Risk |pdfUrl=https://ceur-ws.org/Vol-1518/paper1.pdf |volume=Vol-1518 |dblpUrl=https://dblp.org/rec/conf/lak/Ochoa15 }} ==Visualizing Uncertainty in the Prediction of Academic Risk== https://ceur-ws.org/Vol-1518/paper1.pdf
Visualizing Uncertainty in the Prediction of Academic Risk

                                                       Xavier Ochoa
                                          Escuela Superior Politécnica del Litoral
                                                Vía Perimetral, Km. 30.5
                                                   Guayaquil, Ecuador
                                                xavier@cti.espol.edu.ec


ABSTRACT
This work proposes a generic visual representation to help relevant decision-makers to effectively address the inherent uncertainty present in the prediction of academic risk based on historical data. The three main sources of uncertainty in this type of prediction are visualized: the model predictive power, the data consistency and the case completeness of the historic dataset. To demonstrate the proposed visualization technique, it is instantiated in a real-world scenario where the risk to fail at least one course in an academic semester is predicted and presented in a student-counseling system. This work also proposes how this visualization technique can be evaluated and applied to other Visual Learning Analytics tools.

Categories and Subject Descriptors
K.3.1 [Computing Milieux]: Computers and Education - Computer Uses in Education

Keywords
Visual Learning Analytics, Uncertainty Visualization, Academic Risk

1. INTRODUCTION
The main goal of the Learning Analytics field is to provide relevant information to the actors of the learning process (students, instructors and administrators) to help them take better learning-related decisions. A considerable amount of research effort [6] has been invested in finding ways to analyze the large amount of traces that are a by-product of the learning process and to convert them into that relevant information. An equally important, but less researched, area of Learning Analytics explores the best ways in which that relevant information is presented to the final user to maximize its usefulness for decision-making. This second area is often called “Visual Learning Analytics” given that it is closely related to the field of Visual Analytics, which focuses on “analytical reasoning facilitated by interactive visual interfaces” [14]. Visual Analytics differs from simple data visualization because its purpose is not only presenting the information resulting from a predefined analysis process, but empowering the decision-maker to control the analysis process and interact with the multiple dimensions that the resulting information could have, in order to gain a deep understanding of the implications that those results have for the decision at hand.

Currently, there are very few early examples of Visual Learning Analytics, in contrast to simple visualization of Learning Analytics results: Lemo [7] is a system that uses interactive visualization to help instructors understand the activity logs of LMSs. The end-user is capable of exploring the dataset through selecting and filtering the desired information in a variety of visualization options. Gomez et al. [8] also create a system to explore in deeper detail the academic and non-academic data stored in the LMS through the use of interactive visualizations.

One virtually unexplored avenue of Visual Learning Analytics is how to make explicit the uncertainty that is inherent in any analysis process in a way that is meaningful for the decision-maker. Moreover, if possible, the decision-maker should also be able to manipulate the analysis process to adjust the uncertainty to a level that he or she finds appropriate. These kinds of techniques to present and manage uncertainty are common in more mature fields such as meteorology (e.g. hurricane path prediction uncertainty [13]), medicine (e.g. uncertainty in the effect of medical interventions [10]) and economics (e.g. uncertainty in the prediction of future growth [13]). There exist, however, some examples of the visualization of uncertainty in Open Learner Models [5] that could be considered a precursor in the field of Visual Learning Analytics.

This work will focus on how Visual Learning Analytics techniques could be used to visualize and control the inherent uncertainty in the prediction of academic risk. The organization of this paper is as follows: Section 2 explores how academic risk is usually obtained and which are the main sources of uncertainty in this type of analysis. Section 3 discusses how the prediction value, together with the main uncertainty values, should be visualized. Section 4 presents a case-study where the visualization techniques are instantiated to help counselors give advice about the risk to fail a semester to individual students. Finally, the paper finishes with conclusions about the work and guides for further work
to evaluate the technique and how to adapt it to other Visual Learning Analytics tools.

2. PREDICTING ACADEMIC RISK
In the context of this work, the term “academic risk” is defined as the probability of a student reaching an unfavorable outcome in their studies. This unfavorable outcome could be as benign as the failure to submit a homework or as costly as dropping out of a program. As very little can be done once the unfavorable outcome has already been reached, especially for the more costly forms (e.g. failing a course or dropping out), there is a strong incentive to be able to estimate the academic risk of the student or, what is equivalent, predict the probability that the student will, without intervention, reach the unfavorable outcome. Due to its importance, predicting different forms of academic risk has been one of the oldest forms of Learning Analytics [11].

There are several current examples of systems that seek to estimate different kinds of academic risks: Signals [1] is arguably the poster-boy of learning analytics systems to predict academic risk. Using historical and current information about the behavior of a student in a course, it is able to predict the probability that the student will fail the course. Another, simpler approach is taken by StepUp! [12], which just compares the activity of a student with the activity of their peers and assigns a ranking value that could be seen as a fuzzy academic risk predictor. Finally, there are several modern drop-out risk predictors, of which the work of Dekker et al. [4] could be considered a good representative. This system uses a classification tree trained over historical data in order to obtain rules to assess the risk of a student dropping out from a university program.

All of the mentioned systems use data collected from previous or current students to create a prediction model. This model could be built with statistical or data-mining methods. Once the model has been built, it is fed with the information from the student target of the prediction and an estimation of the academic risk is produced. This estimation is normally presented to the instructor, counselor or the student through some form of visualization technique.

In all of the steps of the above-mentioned process there are inherent uncertainties that are propagated and contribute to the uncertainty that is present in the estimated value of academic risk. The following subsection discusses the nature of these sources of uncertainty and their relative importance for the prediction.

2.1 Uncertainty Sources
To facilitate the analysis of the different sources of inherent uncertainty in the prediction of academic risk, they are classified in two groups according to their origin: predictive model limitations and dataset limitations. The following subsections sub-classify these two groups into more concise and measurable uncertainty values.

2.1.1 Predictive Model Limitations
Perhaps the most obvious source of uncertainty introduced in any type of prediction is the one introduced by the imperfections of the predictive model. In general, predictive models are built to take as input a group of predictor variables and to produce a predicted value. Given that models are only an approximation and simplification of reality, it is expected that the predicted values differ, in different degrees, from the real values. A whole area of Statistics is devoted to measuring the predictive power of different types of models. The best-known example of a measure of predictive power is the R-squared statistic used to score regression models. This measurement establishes what percentage of the variance in the real values of the predicted quantity is explained by the model. Different models usually have different predictive power depending on the predictor variables used, the type of algorithm and the amount and quality of data used to build them. It is a common practice to evaluate different competing models and select the one with the best predictive power according to an appropriate scoring function.

2.1.2 Dataset Limitations
Given that most academic risk predictors are built based on historical or current data, the characteristics of the data and its limitations play a major role in the overall uncertainty of the predicted value of that risk. The work of Thomson et al. [15] established a detailed typology for the limitations of data that affect certainty in predictive models: accuracy, precision, completeness, consistency, lineage, currency, credibility, subjectivity and interrelatedness. All these types of limitations are usually defined at the dataset level and their effect on uncertainty is usually propagated into the final predictive power of the model that was built with that dataset.

Given the nature of academic datasets, the most important of these dimensions are consistency and subjectivity. Historical academic data, for example final grades of students, is generally accurate (there is a significant cost of registering a grade wrongly), precise (it has enough resolution to separate passing and failing students), complete (all students should have grades or at least a pass/fail at the end of a course), current (the grades are produced during the course or at least very close to the end of the course) and credible (the academic institutions will have serious problems if their academic records are not credible). Also, academic records have no major problems with lineage (the grades are rarely processed after the instructor records them) and the records do not suffer from interrelatedness (instructors do not copy the grades from one student to another or among them). However, consistency of academic data could introduce uncertainty in the prediction of academic risk. As academic programs evolve, they also change: the courses offered could change, the grading rules could become more strict or more relaxed, different instructors will imprint their own characteristics on the courses, among other changes. Depending on the nature and magnitude of the changes, the academic records of a current student and one that studied ten years ago might not be comparable or, more dangerously for prediction models, could provide a false sense of similarity when in reality the values in those records are not measuring the same student characteristics. Another possible limitation of historical academic data is its subjectivity. Grades, scores and student evaluations are commonly assigned according to the criteria of the instructor. Even during the same course, students that did a similar level of work could receive different grades. While the effect of consistency errors on the overall prediction uncertainty could be limited by only considering comparable years of the academic program in the dataset, the uncertainty produced by subjectivity cannot be reduced if it is already present in the data.

Due to the fact that most academic risk predictors compare current students to previous similar students that were in a similar context, another type of data limitation plays a role in the overall uncertainty of the prediction: case completeness. For example, predictive model A estimates the academic risk of failing a course based on the number of other courses taken at the same time and the GPA of the student; predictive model B estimates the academic risk of failing a course based on the number of courses taken at the same time, the GPA of the student, whether the student has an external job, whether the student is married, the number of children the student has, the distance from his or her house to the university and the number of courses taken before the current one. Both models estimate the academic risk of failing the course as the percentage of similar students that have failed the course in the past. A hypothetical prediction power analysis shows that model B is less uncertain than model A. However, this prediction power is calculated for the general population; for some students model A could be less uncertain than model B. Let us suppose that student A is taking 3 other courses, has a GPA of 3.5, has an external job, is married, has 5 children, lives 100 km from the university and has taken just one course before the current one. Let us suppose too that this is a very unusual combination of values for the students of this specific course. If model A is applied, only the number of other courses that the student is currently taking (3) and his or her GPA (3.5) are considered. These two values, by themselves, are not unusual, so it is probable that there will be several previous students that could be considered similar. The prediction of academic risk for the hypothetical student will be drawn from a large pool of previous experiences. If model B is applied, due to the unusual values of the rest of the variables, the model could only find one other student close enough to be considered similar in the dataset. In this situation, the prediction of academic risk for the student will be 100% if the previous student failed the course or 0% if he or she passed. While, in general, model B has more predictive power than model A, for this particular student the approximate estimation of model A will be much less uncertain than the one provided by model B, due to the lack of similar cases in the dataset. The prediction for “outlier” students, that is, students that have few similar students in the dataset, is less certain than the prediction for “mainstream” students, who have a large collection of similar cases. Simple models have fewer similarity dimensions, and the number of possible cases is lower than in complex models with larger dimension sets. The variety and quantity of cases in the dataset, that is, the case completeness of the dataset, introduce an uncertainty factor that varies from student to student and depends on the complexity of the model.

3. VISUALIZING UNCERTAINTY
As mentioned in the introduction, the visualization of uncertainty is already an established feature in more mature fields. In Visual Learning Analytics, however, there are still no thoroughly evaluated techniques. The most recommended path in this case will be to adapt uncertainty visualization techniques that are common and proved useful [3] in other fields to represent the predicted value, together with the different uncertainty produced by the sources described in the previous section: the model predictive power, the data consistency and the case completeness. The goal of the visualization of those values is to present the most information about the prediction in an interpretable and useful way. The following subsections propose techniques for each one of these elements in detail.

3.1 Predicted Risk Value
The value of the academic risk of a student, being just a scalar that can be expressed as an easily interpretable numeric value between 0 and 1 (as probability) or from 0% to 100% (as relative frequency), can be presented using a large variety of visualization techniques such as textual, progress arc, gauge or bullet graphs. Figure 1 shows an example of this type of visualization. Attached to the visualization of the value, all of these types of visualization present the decision-maker with a pre-defined guide to assess the level of risk described depending on the magnitude of the value. In the case of textual and arc representations, the color of the text or the arc (e.g. green, yellow and red) or an additional iconic representation (e.g. traffic light) could be used to provide an indication of the severity of the risk. In the case of the gauge and bullet graphs, different ranges can be color-coded to also provide this information. Some previous implementations of visualization of academic risk, such as Signals [1], use only an iconic representation (the traffic light approach) to represent the predicted value. Representing only the range in which the value lies, instead of the actual value, is used to account for the uncertainty of the prediction. However, in most cases, those ranges are crisp, meaning that a single unit change in the predictive value can cause the color to change, defeating the purpose of presenting only ranges in the first place. For example, a student with a risk of 0.50 will be coded with green, while a student with a risk of 0.51 will be coded yellow. With just the iconic representation, there is no way for the decision-maker to establish if the student is closer to green or to red. Moreover, the span of the ranges (what values are considered to be green, yellow or red) is often also unknown to the decision-maker. Using only the iconic representation is discouraged given that this work presents other ways to deal with the inherent uncertainty in the prediction.

3.2 Model Predictive Power
Similarly to the predicted risk value, the model predictive power is also a scalar magnitude. Contrary to risk probability, the meaning of the output of the different model-scoring techniques (such as R-squared, BIC, AIC, Brier score, etc.) is far from being easy to interpret by non-statisticians. To effectively communicate the predictive power of the model, or what is the same, the level of uncertainty that a given model will introduce in the prediction, the expert analyst in charge of the academic risk prediction should define a set of iconic representations (e.g. traffic lights, happy-sad faces, plus signs, etc.) to correspond with different values of predictive power. Given that usually there are no models with “bad” power (otherwise they would not be used in the analysis), it is recommended that a plus signs textual representation (“+”
for lower scoring models, “++” for medium scoring models and “+++” for the best scoring models) is used to represent different levels of power. The words “Good”, “Very Good” and “Excellent” could complement or replace this visualization. An example of this visualization can be seen in Figure 2.

It is important to note that this visualization is only necessary when the decision-maker can select between different models or the system chooses the model based on the available data. If the risk prediction uses a single model, the value of presenting this extra information is diminished.

Figure 1: Predicted value visualization: a) textual representation, b) progress arc graph, c) gauge graph and d) bullet graph

Figure 2: Model predictive power visualization

3.3 Data Consistency
The representation of the uncertainty introduced by data inconsistency is challenging given that there is no way to precisely measure it. In the case of academic datasets, the consistency is related to the changes in different aspects of the study program or course over time. It is expected that the closer in time the historic data is, the greater the level of consistency and the lower the level of uncertainty. If there exists a record of major changes in the academic program (course changes, evaluation policy changes, etc.) or the courses (syllabus changes, pre-requisite changes, instructor changes, etc.), they can be plotted in a timeline that spans the whole range of the historical data. In this way, instructors and counselors that are familiar with the history of the program or course could recognize the changes and adjust their perception of the uncertainty introduced in the prediction, while students or users not familiar with the history of the program or course could just count the number of changes to form their own estimation of the uncertainty in the prediction, although a less precise one than that of users with previous knowledge. An example of this type of visualization can be seen in Figure 3.

Figure 3: Data consistency visualization for a course historical data

3.4 Case Completeness
In most predictive models it is easy to obtain a measure of how many “similar” elements are considered at the moment of obtaining the predictive value for a given element. In the case of academic data, the case completeness could be measured as the number of records that are directly used to calculate the academic risk of a given student. This number could go from 0 to the total number of records in the dataset. A low value is an indication of a high uncertainty in the predicted value. Higher values, usually larger than 30, are enough to discount the number of cases as a source of uncertainty. The recommended visualization technique for this value is an iconic representation with icons that represent alert states at different numbers of cases pre-defined by the expert behind the analysis (e.g. a red stop sign for values between 0 and 5, a yellow exclamation mark for values between 5 and 30 and a green check for values higher than 30). Together with the icon, a textual representation of the number of cases could be included to improve understandability (e.g. “This prediction is based only on 3 previous cases”). Figure 4 presents an example of this visualization.

Figure 4: Case completeness visualization based on iconic representation

3.5 Interaction
The visualizations described in the previous sub-sections could help the decision-maker to better understand the inherent uncertainty of the risk value prediction. However, if the decision-maker is not comfortable with the uncertainty of the prediction, the only course of action is to discard the prediction. As mentioned in Section 2, the uncertainty of the prediction depends on several factors such as the model used, the length of historical data used and the number of similar cases used by the model to generate the prediction for a given student. The trade-off between these parameters is decided by the expert in charge of the prediction. Usually the model selected will be the one with the greatest predictive power and the range of historical data will be selected to maximize this number. This selection is bound to be sub-optimal for some students, especially those with special cases. The use of interactive visualization transfers the control of the analysis parameters to the decision-maker. He or she could adjust them in order to reach the lowest level of uncertainty possible for a given student and the domain knowledge that the decision-maker has about the academic program or course.
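The case-completeness measure of Section 3.4, and the model A versus model B example of Section 2.1.2, can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the field names, the toy records, the use of exact matching as the similarity criterion, and the choice to place the boundary values 5 and 30 in the yellow band. It is not the implementation used in the case study.

```python
# Hypothetical sketch: similarity-based risk plus case-completeness feedback.
# Field names, data and boundary handling are illustrative assumptions.

def predict_risk(history, student, features):
    """Return (risk, n_cases), using only the records that match
    `student` on every feature listed in `features`."""
    cases = [r for r in history if all(r[f] == student[f] for f in features)]
    if not cases:
        return None, 0  # no similar case: no prediction can be made
    failures = sum(1 for r in cases if r["failed"])
    return failures / len(cases), len(cases)


def completeness_icon(n_cases):
    """Map the number of similar cases to an alert icon.
    Assumption: 5 and 30 themselves fall in the yellow band."""
    if n_cases < 5:
        return "red stop sign"
    if n_cases <= 30:
        return "yellow exclamation mark"
    return "green check"


def describe(history, student, features):
    """Build the textual companion of the icon."""
    risk, n = predict_risk(history, student, features)
    if risk is None:
        return f"{completeness_icon(n)}: no similar cases, no prediction"
    return (f"{completeness_icon(n)}: risk {risk:.0%}. "
            f"This prediction is based on {n} previous cases")


# Model A uses two similarity dimensions, model B uses five.
MODEL_A = ["courses", "gpa"]
MODEL_B = MODEL_A + ["has_job", "married", "children"]

# 40 "mainstream" records (10 of them failures) and a single record
# that resembles the unusual student.
HISTORY = [
    {"courses": 3, "gpa": 3.5, "has_job": False, "married": False,
     "children": 0, "failed": i < 10}
    for i in range(40)
]
HISTORY.append({"courses": 3, "gpa": 3.5, "has_job": True, "married": True,
                "children": 5, "failed": True})

STUDENT = {"courses": 3, "gpa": 3.5, "has_job": True, "married": True,
           "children": 5}

if __name__ == "__main__":
    print("Model A:", describe(HISTORY, STUDENT, MODEL_A))
    print("Model B:", describe(HISTORY, STUDENT, MODEL_B))
```

With this toy data, the simpler model draws on the whole pool of records, while the more complex model finds a single similar case and therefore returns an extreme risk value flagged with the red icon. A real system would likely replace exact matching with a distance or clustering criterion.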
Very simple interactive controls could be added to the visualization in order to control the main parameters affecting the uncertainty factors. Each time a new value is selected on those controls, the uncertainty visualizations should be updated, enabling the exploration of the uncertainty space by the decision-maker. To control the uncertainty resulting from the predictive power of the model, the decision-maker could be presented with a set of widgets where the model algorithm or parameters could be selected. To control the uncertainty resulting from the lack of consistency in the historical records, the timeline where this information is presented could be complemented with a selection bar to select subsets of the whole time period. The uncertainty produced by the lack of similar cases cannot be affected directly, but it will change in response to the changes in the model used and the selected time period.

was used to create 10 clusters). The semesters were clustered at five levels (all using Fuzzy C-means): Level 1, based on the total load of the courses calculated from their difficulty [9]; Level 2, based on the typology of courses; Level 3, based on the grades that the students obtain in the courses [9]; Level 4, based on the knowledge area of the courses; Level 5, based on the actual name of the courses. The intersection of the student and semester clustering levels defines a predictive model. For example, a model is created by finding similar students based on their GPA taking similar semesters based on the difficulty of courses taken (Level 2). The predictive power of the models was obtained by computing the Brier score [16] of the forecast made for the last semester (2013-2) with the models built from the data from all the previous semesters.
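For readers unfamiliar with the Brier score mentioned above, the following Python sketch shows its standard definition: the mean squared difference between the forecast probabilities and the observed binary outcomes, so that lower values indicate better forecasts. The function name and the sample numbers are hypothetical, and the text does not state how the resulting scores were mapped to the plus-sign icons, so no such mapping is attempted here.

```python
# Minimal sketch of the Brier score for binary outcomes.
# The sample forecasts and outcomes below are made up for illustration.

def brier_score(forecasts, outcomes):
    """Mean squared error between predicted probabilities (0..1)
    and observed outcomes (1 = failed at least one course, 0 = did not)."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    predicted_risk = [0.9, 0.2, 0.6, 0.1]   # hypothetical model output
    failed = [1, 0, 0, 0]                   # hypothetical observed outcomes
    print(f"Brier score: {brier_score(predicted_risk, failed):.3f}")
```

A perfect forecaster scores 0, while always predicting a risk of 0.5 scores 0.25 regardless of the outcomes, which gives a rough baseline when comparing competing models.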

4. CASE-STUDY: RISK TO FAIL
To illustrate the ideas presented in the previous sections, they will be applied to a real-world academic risk prediction application. This application is part of a larger counseling system used regularly by professors and students at a mid-size university in Ecuador. The goal of this application is to determine the academic risk of failing at least one course in the next semester based on the planned course selection and study load. To produce this prediction the application uses a variety of models that cluster the student and the planned semester with similar students and semesters in the historical dataset. The models calculate the risk based on the previous frequency of similar students in similar semesters that failed at least one course. The counselor can interact with the visual analysis by selecting the courses that the student will take the next semester, the type of clustering that is applied to select similar students and semesters and the time period used to obtain similar cases. The counselor is presented with a prediction of the probability of the student failing at least one course and the visualization of the uncertainty produced by the model, the data consistency and the case completeness. The counselor uses the information received to recommend that the student take a heavier or lighter study load in the coming semester.

4.3 Visualizing the Prediction
Figure 5 presents the interactive visualization created for the case-study academic risk prediction application. All the elements discussed in Section 3 are present. The predicted value is presented using a bullet graph with a 0%-100% scale, a yellow interval between 50% and 75% and a red interval between 75% and 100%. The model prediction power is shown with an iconic representation of one, two or three plus signs, together with a textual description. The data consistency is represented with an interactive timeline indicating the major events that changed the Computer Science program during the analyzed period. The case completeness of the dataset for the target student is presented using an iconic representation of groups of different numbers of people related to a color (one individual in red to indicate a large amount of uncertainty, a few people in yellow to represent middle values and a green crowd to represent low values). Finally, selection boxes are presented to the decision-maker to define the levels of clustering (for students and semesters) that determine the model that will be used for the prediction. All of these visualizations and controls are implemented with the easy-to-use D3 JavaScript visualization library (see footnote 1).

5. CONCLUSIONS AND FURTHER WORK
                                                                  Visualizing the uncertainty in the prediction of academic
4.1    Dataset                                                    risk, specially in an interactive way, has the potential to im-
                                                                  prove the usefulness of this type of systems. Even simple
The dataset used for this application was built based on
                                                                  techniques are able to present to the decision-maker with
a Computer Science program at the target university. All
                                                                  the information needed to assess the uncertainty of the pre-
the courses taken by CS students each semester and the
                                                                  diction for different selections of model and historical train-
grades obtained in those courses were stored since the first
                                                                  ing data. With an interactive visualization the decision-
semester of 1978 to the second semester 2013. The courses
                                                                  maker, with their domain-expertise knowledge, becomes a
that have changed name were grouped together according to
                                                                  co-designer of the analytic process, instead of a simple user
the transition rules during those changes. A total of 30.929
                                                                  of the results of the analysis. Implementing this visualiza-
semesters were taken by 2.480 different students.
                                                                  tion in real-world scenarios is simple given that the sources
                                                                  of uncertainty are well understood and could be measured
4.2    Predictions Models                                         or estimated.
A multi-level clustering approach was used to build differ-
ent models to find similar students and calculate the aca-        The main task to be completed in this research is the real-
demic risk value. Two main variables controlled the gener-        world evaluation of the visualization to establish the answers
ation of the different models: the student similarity and the     to two main questions: 1) Is the visualization contributing
semester similarity. The students were clustered at three         to the understanding of the inherent uncertainty of the pre-
levels: No clustering at all (all the students were considered    diction of academic risk? and 2) Is the knowledge about
similar), clustering based on GPA values (five clusters based     the uncertainty helping the decision-maker to make better
on range) and clustering based on similarity of grades in the
                                                                  1
different courses (the Fuzzy C-means (FCM) algorithm [2]              D3.JS visualization library - http://d3js.org
Figure 5: Example of visualization integrated in the counseling system: 1) Course selector, 2) Predicted
academic risk value visualization, 3) Model selector, 4) Time period selector and consistency visualization,
5) Model predictive power visualization and 6) Case completeness visualization


decisions or to provide better advice? To answer these questions, the tool
presented in the case study will be used by two experimental groups of
counselors. One group will see the prediction and the uncertainty
visualization. The second group will see only the prediction visualization.
A third, control group will continue to use the counseling system without the
academic risk predictor application. The average failure rate for each
counselor will be recorded at the end of the semester and compared between
the experimental and control groups and also with the failure rate from
previous semesters. Surveys will be conducted just after the counseling
sessions in order to establish the level of understanding of the uncertainty
in the prediction.

Finally, the ideas presented in this paper could be adapted to other types of
Visual Learning Analytics tools, especially those focused on prediction and
forecasting. The methodology followed in this paper could be a general
framework for these adaptations: 1) exploring the main sources of uncertainty
in the analysis, 2) establishing methods to measure or estimate the
uncertainty contribution of those sources, 3) using existing visualization
techniques to present the uncertainty values in a way that will be easy to
interpret by the end-user, 4) providing control to the end-user through
interactive visualizations to change the parameters of the models and to
select the desired data and 5) evaluating the impact of the visualization.
Visualizing the uncertainty is a way to empower the user of Visual Learning
Analytics tools, stressing that automatic analysis could support, but not
replace, human judgment.

6.   ACKNOWLEDGMENTS
The author wants to acknowledge the contribution of Secretaría Nacional de
Educación Superior, Ciencia y Tecnología (SENESCYT) in Ecuador and the Fonds
Wetenschappelijk Onderzoek - Vlaanderen (FWO) in Belgium through the funding
of the “Managing Uncertainty in Visual Analytics” project, to which this work
belongs.

7.   REFERENCES
 [1] K. E. Arnold and M. D. Pistilli. Course signals at
     purdue: Using learning analytics to increase student
     success. In Proceedings of the 2nd International
     Conference on Learning Analytics and Knowledge,
     pages 267–270. ACM, 2012.
 [2] J. C. Bezdek. Pattern recognition with fuzzy objective
     function algorithms. Kluwer Academic Publishers,
     1981.
 [3] S. Deitrick and R. Edsall. The influence of uncertainty
     visualization on decision making: An empirical
     evaluation. Springer, 2006.
 [4] G. W. Dekker, M. Pechenizkiy, and J. M.
     Vleeshouwers. Predicting students drop out: A case
     study. In International Conference on Educational
     Data Mining (EDM). ERIC, 2009.
 [5] C. Demmans-Epp, S. Bull, and M. Johnson.
     Visualising uncertainty for open learner model users.
     In CEUR Proceedings associated with UMAP 2014,
     2014.
 [6] R. Ferguson. Learning analytics: drivers,
     developments and challenges. International Journal of
     Technology Enhanced Learning, 4(5):304–317, 2012.
 [7] A. Fortenbacher, L. Beuster, M. Elkina, L. Kappe,
     A. Merceron, A. Pursian, S. Schwarzrock, and
     B. Wenzlaff. Lemo: A learning analytics application
     focussing on user path analysis and interactive
     visualization. In Intelligent Data Acquisition and
     Advanced Computing Systems (IDAACS), 2013 IEEE
     7th International Conference on, pages 748–753,
     2013.
 [8] D. Gomez, C. Suarez, R. Theron, and F. Garcia.
     Advances in Learning Processes, chapter Visual
     Analytics to Support E-learning. InTech, 2010.
 [9] G. Méndez, X. Ochoa, and K. Chiluiza. Techniques for
     data-driven curriculum analysis. In Proceedings of the
     Fourth International Conference on Learning
     Analytics And Knowledge, LAK ’14, pages 148–157,
     New York, NY, USA, 2014. ACM.
[10] M. C. Politi, P. K. Han, and N. F. Col.
     Communicating the uncertainty of harms and benefits
     of medical interventions. Medical Decision Making,
     27(5):681–695, 2007.
[11] C. Rampell. Colleges mine data to predict dropouts.
     The chronicle of higher education, 54(38):A1, 2008.
[12] J. L. Santos, K. Verbert, S. Govaerts, and E. Duval.
     Addressing learner issues with stepup!: an evaluation.
     In Proceedings of the Third International Conference
     on Learning Analytics and Knowledge, pages 14–22.
     ACM, 2013.
[13] D. Spiegelhalter, M. Pearson, and I. Short. Visualizing
     uncertainty about the future. Science,
     333(6048):1393–1400, 2011.
[14] J. Thomas and P. C. Wong. Visual analytics. IEEE
     Computer Graphics and Applications, 24(5):0020–21,
     2004.
[15] J. Thomson, E. Hetzler, A. MacEachren, M. Gahegan,
     and M. Pavel. A typology for visualizing uncertainty.
     In Electronic Imaging 2005, pages 146–157.
     International Society for Optics and Photonics, 2005.
[16] D. S. Wilks. Statistical methods in the atmospheric
     sciences, volume 100. Academic press, 2011.
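
The frequency-based risk estimate described in Section 4 and the color
intervals of the bullet graph in Section 4.3 can be illustrated with a
minimal Python sketch. It is not the implementation used in the counseling
system. The record fields, the 0-10 GPA scale, the equal-width GPA ranges,
the use of the number of courses as the semester similarity and the case-count
thresholds for the completeness icon are all assumptions made for
illustration; only the five GPA clusters, the "failed at least one course"
frequency and the 50%/75% intervals come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SemesterRecord:
    """One historical student-semester (field names are hypothetical)."""
    gpa: float         # GPA before the semester, assumed to be on a 0-10 scale
    n_courses: int     # study load, assumed stand-in for semester similarity
    failed_any: bool   # True if at least one course was failed


def gpa_cluster(gpa: float, n_clusters: int = 5, gpa_max: float = 10.0) -> int:
    """Map a GPA to one of n_clusters equal-width ranges (assumed binning)."""
    index = int(gpa * n_clusters / gpa_max)
    return min(max(index, 0), n_clusters - 1)


def estimate_risk(history: List[SemesterRecord], gpa: float,
                  n_courses: int) -> Tuple[Optional[float], int]:
    """Return (risk, number of similar cases).

    Risk is the fraction of similar historical semesters in which at least
    one course was failed. It is None when no similar case exists.
    """
    target = gpa_cluster(gpa)
    similar = [r for r in history
               if gpa_cluster(r.gpa) == target and r.n_courses == n_courses]
    if not similar:
        return None, 0
    failures = sum(1 for r in similar if r.failed_any)
    return failures / len(similar), len(similar)


def risk_zone(risk: float) -> str:
    """Bullet-graph interval: yellow from 50% to 75%, red from 75% to 100%."""
    if risk >= 0.75:
        return "red"
    if risk >= 0.50:
        return "yellow"
    return "neutral"


def completeness_icon(n_cases: int, few: int = 10, many: int = 100) -> str:
    """Pick the case-completeness icon; both thresholds are hypothetical."""
    if n_cases < few:
        return "one individual (red)"
    if n_cases < many:
        return "few people (yellow)"
    return "crowd (green)"


if __name__ == "__main__":
    # Tiny made-up history, only to show how the pieces fit together.
    demo = [
        SemesterRecord(6.5, 5, True),
        SemesterRecord(7.0, 5, True),
        SemesterRecord(7.9, 5, True),
        SemesterRecord(6.1, 5, False),
        SemesterRecord(9.0, 5, False),
    ]
    risk, n_cases = estimate_risk(demo, gpa=7.2, n_courses=5)
    if risk is not None:
        print(risk, risk_zone(risk), completeness_icon(n_cases))
```

Returning the number of similar cases together with the risk keeps the
prediction and its case completeness side by side, which is the pairing that
the visualization presents to the counselor.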