=Paper=
{{Paper
|id=Vol-1518/paper1
|storemode=property
|title=Visualizing Uncertainty in the Prediction of Academic Risk
|pdfUrl=https://ceur-ws.org/Vol-1518/paper1.pdf
|volume=Vol-1518
|dblpUrl=https://dblp.org/rec/conf/lak/Ochoa15
}}
==Visualizing Uncertainty in the Prediction of Academic Risk==
Visualizing Uncertainty in the Prediction of Academic Risk

Xavier Ochoa
Escuela Superior Politécnica del Litoral
Vía Perimetral, Km. 30.5
Guayaquil, Ecuador
xavier@cti.espol.edu.ec

ABSTRACT

This work proposes a generic visual representation to help relevant decision-makers effectively address the inherent uncertainty present in the prediction of academic risk based on historical data. The three main sources of uncertainty in this type of prediction are visualized: the model predictive power, the data consistency and the case completeness of the historical dataset. To demonstrate the proposed visualization technique, it is instantiated in a real-world scenario where the risk of failing at least one course in an academic semester is predicted and presented in a student-counseling system. This work also proposes how this visualization technique can be evaluated and applied to other Visual Learning Analytics tools.

Categories and Subject Descriptors

K.3.1 [Computing Milieux]: Computers and Education - Computer Uses in Education

Keywords

Visual Learning Analytics, Uncertainty Visualization, Academic Risk

1. INTRODUCTION

The main goal of the Learning Analytics field is to provide relevant information to the actors of the learning process (students, instructors and administrators) to help them make better learning-related decisions. A considerable amount of research effort [6] has been invested in finding ways to analyze the large amount of traces that are a by-product of the learning process and convert them into that relevant information.

An equally important, but less researched, area of Learning Analytics explores the best ways in which that relevant information is presented to the final user to maximize its usefulness for decision-making. This second area is often called "Visual Learning Analytics" given that it is closely related to the field of Visual Analytics, which focuses on "analytical reasoning facilitated by interactive visual interfaces" [14]. Visual Analytics differs from simple data visualization because its purpose is not only presenting the information resulting from a predefined analysis process, but empowering the decision-maker to control the analysis process and interact with the multiple dimensions that the resulting information could have, in order to gain a deep understanding of the implications that those results have for the decision at hand.

Currently, there are only a few early examples of Visual Learning Analytics, in contrast to simple visualization of Learning Analytics results: LeMo [7] is a system that uses interactive visualization to help instructors understand the activity logs of LMSs. The end-user is capable of exploring the dataset by selecting and filtering the desired information in a variety of visualization options. Gomez et al. [8] also created a system to explore in deeper detail the academic and non-academic data stored in the LMS through the use of interactive visualizations.

One virtually unexplored avenue of Visual Learning Analytics is how to make explicit the uncertainty that is inherent in any analysis process in a way that is meaningful for the decision-maker. Moreover, if possible, the decision-maker should also be able to manipulate the analysis process to adjust the uncertainty to a level that he or she finds appropriate. Such techniques to present and manage uncertainty are common in more mature fields such as meteorology (e.g. hurricane path prediction uncertainty [13]), medicine (e.g. uncertainty in the effect of medical interventions [10]) and economics (e.g. uncertainty in the prediction of future growth [13]). There exist, however, some examples of the visualization of uncertainty in Open Learner Models [5] that could be considered a precursor in the field of Visual Learning Analytics.

This work focuses on how Visual Learning Analytics techniques could be used to visualize and control the inherent uncertainty in the prediction of academic risk. The organization of this paper is as follows: Section 2 explores how academic risk is usually obtained and which are the main sources of uncertainty in this type of analysis. Section 3 discusses how the prediction value, together with the main uncertainty values, should be visualized. Section 4 presents a case study where the visualization techniques are instantiated to help counselors give advice to individual students about the risk of failing a semester. Finally, the paper finishes with conclusions about the work and guidelines for further work to evaluate the technique and adapt it to other Visual Learning Analytics tools.
2. PREDICTING ACADEMIC RISK

In the context of this work, the term "academic risk" is defined as the probability of a student reaching an unfavorable outcome in their studies. This unfavorable outcome could be as benign as the failure to submit a homework or as costly as dropping out of a program. As very little can be done once the unfavorable outcome has already been reached, especially for the more costly forms (e.g. failing a course or dropping out), there is a strong incentive to be able to estimate the academic risk of the student or, what is equivalent, predict the probability that the student will, without intervention, reach the unfavorable outcome. Due to its importance, predicting different forms of academic risk has been one of the oldest forms of Learning Analytics [11].

There are several current examples of systems that seek to estimate different kinds of academic risk: Signals [1] is arguably the poster child of learning analytics systems that predict academic risk. Using historical and current information about the behavior of a student in a course, it is able to predict the probability that the student has of failing the course. Another, simpler approach is taken by StepUp! [12], which just compares the activity of a student with the activity of their peers and assigns a ranking value that could be seen as a fuzzy academic risk predictor. Finally, there are several modern drop-out risk predictors, of which the work of Dekker et al. [4] could be considered a good representative. This system uses a classification tree trained over historical data to obtain rules to assess the risk of a student dropping out from a university program.

All of the mentioned systems use data collected from previous or current students to create a prediction model. This model could be built with statistical or data-mining methods. Once the model has been built, it is fed with the information from the student who is the target of the prediction and an estimation of the academic risk is produced. This estimation is normally presented to the instructor, counselor or the student through some form of visualization technique.

In all of the steps of the above-mentioned process there are inherent uncertainties that are propagated and contribute to the uncertainty present in the estimated value of academic risk. The following subsection discusses the nature of these sources of uncertainty and their relative importance for the prediction.

2.1 Uncertainty Sources

To facilitate the analysis of the different sources of inherent uncertainty in the prediction of academic risk, they are classified in two groups according to their origin: predictive model limitations and dataset limitations. The following subsections sub-classify these two groups into more concise and measurable uncertainty values.

2.1.1 Predictive Model Limitations

Perhaps the most obvious source of uncertainty in any type of prediction is the one introduced by the imperfections of the predictive model. In general, predictive models are built to take as input a group of predictor variables and to produce a predicted value. Given that models are only an approximation and simplification of reality, it is expected that the predicted values differ, to different degrees, from the real values. A whole area of Statistics is devoted to measuring the predictive power of different types of models. The best-known example of a measure of predictive power is the R-squared statistic used to score regression models. This measurement establishes what percentage of the variance in the real values of the predicted quantity is explained by the model. Different models usually have different predictive power depending on the predictor variables used, the type of algorithm and the amount and quality of data used to build them. It is common practice to evaluate different competing models and select the one with the best predictive power according to an appropriate scoring function.
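To make this model-selection step concrete, the following minimal sketch (not part of the system described in this paper) scores two invented candidate models against held-out outcomes with an R-squared function and keeps the better one; the data, the model names and the choice of scoring function are assumptions for illustration only.

<pre>
import numpy as np

def r_squared(y_true, y_pred):
    """Fraction of the variance in the real values explained by the model."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical held-out outcomes and the predictions of two competing models.
y_true = np.array([0.2, 0.8, 0.5, 0.9, 0.1])
candidates = {
    "model_A": np.array([0.30, 0.70, 0.40, 0.80, 0.20]),
    "model_B": np.array([0.25, 0.75, 0.55, 0.85, 0.15]),
}

scores = {name: r_squared(y_true, pred) for name, pred in candidates.items()}
best = max(scores, key=scores.get)
print(scores)              # predictive power of each candidate
print("selected:", best)   # the best-scoring model is used for the prediction
</pre>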
2.1.2 Dataset Limitations

Given that most academic risk predictors are built from historical or current data, the characteristics of the data and its limitations play a major role in the overall uncertainty of the predicted value of that risk. The work of Thomson et al. [15] established a detailed typology of the limitations of data that affect certainty in predictive models: accuracy, precision, completeness, consistency, lineage, currency, credibility, subjectivity and interrelatedness. All these types of limitations are usually defined at the dataset level and their effect on uncertainty is usually propagated into the final predictive power of the model that was built with that dataset.

Given the nature of academic datasets, the most important of these dimensions are consistency and subjectivity. Historical academic data, for example the final grades of students, is generally accurate (there is a significant cost to registering a grade wrongly), precise (it has enough resolution to separate passing and failing students), complete (all students should have grades or at least a pass/fail at the end of a course), current (the grades are produced during the course or at least very close to its end) and credible (academic institutions will have serious problems if their academic records are not credible). Also, academic records have no major problems with lineage (the grades are rarely processed after the instructor records them) and the records do not suffer from interrelatedness (instructors do not copy the grades from one student to another). However, the consistency of academic data could introduce uncertainty into the prediction of academic risk. As academic programs evolve, they also change: the courses offered could change, the grading rules could become stricter or more relaxed, and different instructors will imprint their own characteristics on the courses, among other changes. Depending on the nature and magnitude of the changes, the academic records of a current student and one who studied ten years ago could not be comparable or, more dangerously for prediction models, could provide a false sense of similarity when in reality the values in those records are not measuring the same student characteristics. Another possible limitation of historical academic data is its subjectivity. Grades, scores and student evaluations are commonly assigned according to the criteria of the instructor. Even during the same course, students that did a similar level of work could receive different grades. While the effect of consistency errors on the overall prediction uncertainty could be limited by only considering comparable years of the academic program in the dataset, the uncertainty produced by subjectivity cannot be reduced if it is already present in the data.
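As a concrete illustration of the mitigation mentioned above (restricting the dataset to comparable years), the minimal sketch below filters a set of invented historical records so that only those produced after a hypothetical major program change are used; the record structure and the change year are assumptions, not taken from the paper.

<pre>
from dataclasses import dataclass

@dataclass
class Record:
    student_id: int
    year: int
    course: str
    passed: bool

def comparable_records(records, last_major_change_year):
    """Keep only records produced after the last major program change."""
    return [r for r in records if r.year >= last_major_change_year]

history = [
    Record(1, 1998, "Calculus I", True),
    Record(2, 2005, "Calculus I", False),
    Record(3, 2012, "Calculus I", True),
]
# Suppose the grading rules changed substantially in 2004 (hypothetical).
usable = comparable_records(history, last_major_change_year=2004)
print(len(usable), "of", len(history), "records are considered comparable")
</pre>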
Due to the fact that most academic risk predictors compare current students to previous similar students that were in a similar context, another type of data limitation plays a role in the overall uncertainty of the prediction: case completeness. For example, predictive model A estimates the academic risk of failing a course based on the number of other courses taken at the same time and the GPA of the student; predictive model B estimates the academic risk of failing a course based on the number of courses taken at the same time, the GPA of the student, whether the student has an external job, whether the student is married, the number of children the student has, the distance from their house to the university and the number of courses taken before the current one. Both models estimate the academic risk of failing the course as the percentage of similar students that have failed the course in the past. A hypothetical prediction power analysis shows that model B is less uncertain than model A. However, this prediction power is calculated for the general population; for some students model A could be less uncertain than model B.

Let us suppose that student A is taking 3 other courses, has a GPA of 3.5, has an external job, is married, has 5 children, lives 100 km from the university and has taken just one course before the current one. Let us suppose too that this is a very unusual combination of values for the students of this specific course. If model A is applied, only the number of other courses that the student is currently taking (3) and his or her GPA (3.5) are considered. These two values, by themselves, are not unusual, so it is probable that there will be several previous students that could be considered similar. The prediction of academic risk for the hypothetical student will be drawn from a large pool of previous experiences. If model B is applied, due to the unusual values of the rest of the variables, the model could find only one other student close enough to be considered similar in the dataset. In this situation, the prediction of academic risk for the student will be 100% if the previous student failed the course, or 0% if he or she passed. While, in general, model B has more predictive power than model A, for this particular student the approximate estimation of model A will be much less uncertain than the one provided by model B, due to the lack of similar cases in the dataset. The prediction for "outlier" students, that is, students that have few similar students in the dataset, is less certain than the prediction for "mainstream" students, which have a large collection of similar cases. Simple models have fewer similarity dimensions, and the number of possible cases is lower than in complex models with larger dimension sets. The variety and quantity of cases in the dataset, that is, the case completeness of the dataset, introduces an uncertainty factor that varies from student to student and depends on the complexity of the model.
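The following sketch illustrates, with invented data and an invented matching rule, the frequency-based risk estimate and the case-completeness problem described above: a model with more similarity dimensions (model B) finds fewer matching historical cases for an unusual student than a simpler model (model A), so its estimate is drawn from a much smaller pool.

<pre>
def similar_cases(history, student, features, tolerance=0.5):
    """Return the historical records that match the student on every feature."""
    return [
        h for h in history
        if all(abs(h[f] - student[f]) <= tolerance for f in features)
    ]

def empirical_risk(cases):
    """Risk = fraction of the similar students that failed in the past."""
    return None if not cases else sum(c["failed"] for c in cases) / len(cases)

history = [
    {"courses": 3, "gpa": 3.4, "children": 0, "failed": 0},
    {"courses": 3, "gpa": 3.6, "children": 0, "failed": 1},
    {"courses": 3, "gpa": 3.5, "children": 5, "failed": 1},
]
student = {"courses": 3, "gpa": 3.5, "children": 5}

cases_a = similar_cases(history, student, ["courses", "gpa"])               # model A
cases_b = similar_cases(history, student, ["courses", "gpa", "children"])   # model B

# Model A draws from several cases; model B, with more dimensions, finds only
# one, making its estimate far less certain for this "outlier" student.
print(len(cases_a), empirical_risk(cases_a))  # 3 cases, risk ~ 0.67
print(len(cases_b), empirical_risk(cases_b))  # 1 case,  risk = 1.0
</pre>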
3. VISUALIZING UNCERTAINTY

As mentioned in the introduction, the visualization of uncertainty is already an established feature in more mature fields. In Visual Learning Analytics, however, there are still no thoroughly evaluated techniques. The most recommended path in this case is to adapt uncertainty visualization techniques that are common and have proved useful [3] in other fields to represent the predicted value, together with the uncertainty produced by the sources described in the previous section: the model predictive power, the data consistency and the case completeness. The goal of the visualization of those values is to present the most information about the prediction in an interpretable and useful way. The following subsections propose techniques for each one of these elements in detail.

3.1 Predicted Risk Value

The value of the academic risk of a student, being just a scalar that can be expressed as an easily interpretable numeric value between 0 and 1 (as a probability) or from 0% to 100% (as a relative frequency), can be presented using a large variety of visualization techniques such as textual, progress arc, gauge or bullet graphs. Figure 1 shows examples of this type of visualization. Attached to the visualization of the value, all of these types of visualization present the decision-maker with a pre-defined guide to assess the level of risk depending on the magnitude of the value. In the case of the textual and arc representations, the color of the text or the arc (e.g. green, yellow and red) or an additional iconic representation (e.g. a traffic light) could be used to provide an indication of the severity of the risk. In the case of the gauge and bullet graphs, different ranges can be color-coded to also provide this information. Some previous implementations of the visualization of academic risk, such as Signals [1], use only an iconic representation (the traffic light approach) to represent the predicted value. Representing only the range in which the value lies, instead of the actual value, is used to account for the uncertainty of the prediction. However, in most cases those ranges are crisp, meaning that a single unit change in the predicted value can cause the color to change, defeating the purpose of presenting only ranges in the first place. For example, a student with a risk of 0.50 will be coded green, while a student with a risk of 0.51 will be coded yellow. With just the iconic representation, there is no way for the decision-maker to establish whether the student is closer to green or to red. Moreover, the span of the ranges (what values are considered to be green, yellow or red) is often also unknown to the decision-maker. Using only the iconic representation is discouraged given that this work presents other ways to deal with the inherent uncertainty in the prediction.

[Figure 1: Predicted value visualization: a) textual representation, b) progress arc graph, c) gauge graph and d) bullet graph]
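A minimal sketch of the crisp color bands criticized above is shown below; the 50% and 75% thresholds echo the case study in Section 4 but are otherwise an assumption.

<pre>
def risk_colour(risk):
    """Map a risk value in [0, 1] to a colour band with crisp thresholds."""
    if risk <= 0.50:
        return "green"
    if risk <= 0.75:
        return "yellow"
    return "red"

# A change of a single hundredth flips the colour, which is why this work
# recommends also showing the numeric value (text, arc, gauge or bullet graph).
print(risk_colour(0.50))  # green
print(risk_colour(0.51))  # yellow
</pre>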
3.2 Model Predictive Power

Similarly to the predicted risk value, the model predictive power is also a scalar magnitude. Contrary to the risk probability, the meaning of the output of the different model-scoring techniques (such as R-squared, BIC, AIC, Brier score, etc.) is far from easy to interpret for non-statisticians. To effectively communicate the predictive power of the model or, what is the same, the level of uncertainty that a given model will introduce in the prediction, the expert analyst in charge of the academic risk prediction should define a set of iconic representations (e.g. traffic lights, happy-sad faces, plus signs, etc.) that correspond with different values of predictive power. Given that usually there are no models with "bad" power (otherwise they would not be used in the analysis), it is recommended that a plus-sign textual representation ("+" for lower-scoring models, "++" for medium-scoring models and "+++" for the best-scoring models) is used to represent the different levels of power. The words "Good", "Very Good" and "Excellent" could complement or replace this visualization. An example of this visualization can be seen in Figure 2.

It is important to note that this visualization is only necessary when the decision-maker can select between different models or the system chooses the model based on the available data. If the risk prediction uses a single model, the value of presenting this extra information is diminished.

[Figure 2: Model predictive power visualization]
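The sketch below shows one possible (assumed, not prescribed) mapping from a model score to the plus-sign representation proposed above, using a Brier score where lower values indicate better forecasts and cut-off values invented for illustration.

<pre>
def power_icon(brier_score, cutoffs=(0.20, 0.10)):
    """Map a Brier score (lower is better) to a plus-sign representation."""
    if brier_score <= cutoffs[1]:
        return "+++", "Excellent"
    if brier_score <= cutoffs[0]:
        return "++", "Very Good"
    return "+", "Good"

print(power_icon(0.25))  # ('+', 'Good')
print(power_icon(0.08))  # ('+++', 'Excellent')
</pre>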
3.3 Data Consistency

The representation of the uncertainty introduced by data inconsistency is challenging given that there is no way to precisely measure it. In the case of academic datasets, the consistency is related to the changes in different aspects of the study program or course over time. It is expected that the closer in time the historical data is, the greater the level of consistency and the lower the level of uncertainty. If there exists a record of major changes in the academic program (course changes, evaluation policy changes, etc.) or the courses (syllabus changes, pre-requisite changes, instructor changes, etc.), they can be plotted on a timeline that spans the whole range of the historical data. In this way, instructors and counselors that are familiar with the history of the program or course could recognize the changes and adjust their perception of the uncertainty introduced in the prediction, while students or users not familiar with the history of the program or course could just count the number of changes to form their own estimation of the uncertainty in the prediction, although less precise than the one made by users with previous knowledge. An example of this type of visualization can be seen in Figure 3.

[Figure 3: Data consistency visualization for a course's historical data]

3.4 Case Completeness

In most predictive models it is easy to obtain a measure of how many "similar" elements are considered at the moment of obtaining the predicted value for a given element. In the case of academic data, the case completeness could be measured as the number of records that are directly used to calculate the academic risk of a given student. This number could go from 0 to the total number of records in the dataset. A low value is an indication of high uncertainty in the predicted value. Higher values, usually larger than 30, are enough to discount the number of cases as a source of uncertainty. The recommended visualization technique for this value is an iconic representation with icons that represent alert states at different numbers of cases pre-defined by the expert behind the analysis (e.g. a red stop sign for values between 0 and 5, a yellow exclamation mark for values between 5 and 30 and a green check for values higher than 30). Together with the icon, a textual representation of the number of cases could be included to improve understandability (e.g. "This prediction is based only on 3 previous cases"). Figure 4 presents an example of this visualization.

[Figure 4: Case completeness visualization based on iconic representation]
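The alert-state mapping described above can be sketched as follows, using the example thresholds from the text (0 to 5, 5 to 30, more than 30 similar cases); the exact icons and boundary handling are an assumption.

<pre>
def completeness_alert(n_cases):
    """Map the number of similar historical cases to an alert icon and text."""
    if n_cases <= 5:
        icon = "red stop sign"
    elif n_cases <= 30:
        icon = "yellow exclamation mark"
    else:
        icon = "green check"
    return icon, f"This prediction is based on {n_cases} previous cases"

print(completeness_alert(3))
print(completeness_alert(120))
</pre>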
The intersection could be presented with a set of widgets where the model of the level student and semester clustering defines a pre- algorithm or parameters could be selected. To control the dictive model. For example, a model is created by finding uncertainty resulting from the lack of consistency in the his- similar students based on their GPA taking similar semesters torical records, the timeline where this information is pre- based on the difficulty of courses taken (Level 2). The pre- sented could be complemented with a selection bar to select dictive power of the models was obtained computing the subsets of the whole time period. The uncertainty produced Brier score [16] of the forecast made for the last semester by the lack of similar cases could not be affected directly, (2013-2) with the models built from the data from all the but it will change its response to the changes in the model previous semesters. used and the selected time period. 4.3 Visualizing the Prediction 4. CASE-STUDY: RISK TO FAIL Figure 5 presents the interactive visualization created for To illustrate the ideas presented in the previous sections, the case-study academic risk prediction application. All the they will be applied to a real-world academic risk prediction elements discussed in Section 3 are present. The predicted application. This application is part of a larger counseling value is presented using a bullet graph with a 0%-100% scale, system used regularly by professors and students at a mid- a yellow interval between 50% and 75% and a red interval be- size university in Ecuador. The goal of this application is tween 75% and 100%. The model prediction power is shown to determine the academic risk of failing at least in the next with an iconic representation of one, two or three plus signs, semester based on the planned course selection and study together with a textual description. The data consistency is load. To produce this prediction the application uses a va- represented with an interactive timeline indicating the major riety of models that cluster the student and the planned events that changed the Computer Science program during semester with similar students and semesters in the histor- the analyzed period. The case completeness of the dataset ical dataset. The models calculate the risk based on the for the target student is presented using an iconic represen- previous frequency of similar students in similar semesters tation of group of different amounts of people related to a that failed at least one course. The counselor could interact color (one individual in red to indicate a large amount of with the visual analysis by selecting the courses that the stu- uncertainty, few people in yellow to represent middle values dent will take the next semester, the type of clustering that and a green crowd to represent low values. Finally, selection is applied to select similar students and semesters and the boxes are presented to the decision-maker to define the levels time period used to obtain similar cases. The counselor is of clustering (for students and semesters) that determine the presented with a prediction of the probability of the student model that will be used for the prediction. All of these vi- failing the course and the visualization of the uncertainty sualizations and controls are implemented with easy-to-use produced by the model, the data consistency and case com- D3 Javascript visualization library 1 . pleteness. 
4. CASE-STUDY: RISK TO FAIL

To illustrate the ideas presented in the previous sections, they are applied to a real-world academic risk prediction application. This application is part of a larger counseling system used regularly by professors and students at a mid-size university in Ecuador. The goal of this application is to determine the academic risk of failing at least one course in the next semester based on the planned course selection and study load. To produce this prediction the application uses a variety of models that cluster the student and the planned semester with similar students and semesters in the historical dataset. The models calculate the risk based on the previous frequency of similar students in similar semesters that failed at least one course. The counselor can interact with the visual analysis by selecting the courses that the student will take the next semester, the type of clustering that is applied to select similar students and semesters, and the time period used to obtain similar cases. The counselor is presented with a prediction of the probability that the student will fail at least one course and the visualization of the uncertainty produced by the model, the data consistency and the case completeness. The counselor uses the information received to recommend that the student take a larger or smaller study load in the coming semester.

4.1 Dataset

The dataset used for this application was built from a Computer Science program at the target university. All the courses taken by CS students each semester and the grades obtained in those courses have been stored from the first semester of 1978 to the second semester of 2013. The courses that changed name were grouped together according to the transition rules during those changes. A total of 30,929 semesters were taken by 2,480 different students.

4.2 Prediction Models

A multi-level clustering approach was used to build different models to find similar students and calculate the academic risk value. Two main variables controlled the generation of the different models: the student similarity and the semester similarity. The students were clustered at three levels: no clustering at all (all the students were considered similar), clustering based on GPA values (five clusters based on range) and clustering based on the similarity of grades in the different courses (the Fuzzy C-means (FCM) algorithm [2] was used to create 10 clusters). The semesters were clustered at five levels (all using Fuzzy C-means): Level 1, based on the total load of the courses calculated from their difficulty [9]; Level 2, based on the typology of the courses; Level 3, based on the grades that the students obtain in the courses [9]; Level 4, based on the knowledge area of the courses; Level 5, based on the actual name of the courses. The intersection of the student and semester clustering levels defines a predictive model. For example, a model is created by finding similar students based on their GPA taking similar semesters based on the difficulty of the courses taken (Level 2). The predictive power of the models was obtained by computing the Brier score [16] of the forecast made for the last semester (2013-2) with the models built from the data of all the previous semesters.
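As an illustration of this scoring step, the sketch below computes the Brier score of a set of invented forecasts for the held-out semester against the observed outcomes; the numbers are not from the actual evaluation.

<pre>
import numpy as np

def brier_score(predicted_prob, observed):
    """Mean squared difference between forecast probability and outcome (0/1)."""
    predicted_prob = np.asarray(predicted_prob, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean((predicted_prob - observed) ** 2))

# Forecasts for students enrolled in the held-out semester (2013-2), made with
# a model built from all previous semesters, against whether they actually
# failed at least one course (1) or not (0).
p_fail = [0.8, 0.2, 0.6, 0.1]
outcomes = [1, 0, 0, 0]
print(brier_score(p_fail, outcomes))  # lower is better; 0 is a perfect forecast
</pre>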
4.3 Visualizing the Prediction

Figure 5 presents the interactive visualization created for the case-study academic risk prediction application. All the elements discussed in Section 3 are present. The predicted value is presented using a bullet graph with a 0%-100% scale, a yellow interval between 50% and 75% and a red interval between 75% and 100%. The model predictive power is shown with an iconic representation of one, two or three plus signs, together with a textual description. The data consistency is represented with an interactive timeline indicating the major events that changed the Computer Science program during the analyzed period. The case completeness of the dataset for the target student is presented using an iconic representation of groups of different amounts of people related to a color (one individual in red to indicate a large amount of uncertainty, a few people in yellow to represent middle values and a green crowd to represent low values). Finally, selection boxes are presented to the decision-maker to define the levels of clustering (for students and semesters) that determine the model that will be used for the prediction. All of these visualizations and controls are implemented with the easy-to-use D3 JavaScript visualization library (http://d3js.org).

[Figure 5: Example of visualization integrated in the counseling system: 1) Course selector, 2) Predicted academic risk value visualization, 3) Model selector, 4) Time period selector and consistency visualization, 5) Model predictive power visualization and 6) Case completeness visualization]

5. CONCLUSIONS AND FURTHER WORK

Visualizing the uncertainty in the prediction of academic risk, especially in an interactive way, has the potential to improve the usefulness of this type of system. Even simple techniques are able to present the decision-maker with the information needed to assess the uncertainty of the prediction for different selections of model and historical training data. With an interactive visualization the decision-maker, with their domain expertise, becomes a co-designer of the analytic process, instead of a simple user of the results of the analysis. Implementing this visualization in real-world scenarios is simple given that the sources of uncertainty are well understood and can be measured or estimated.

The main task to be completed in this research is the real-world evaluation of the visualization to answer two main questions: 1) Is the visualization contributing to the understanding of the inherent uncertainty of the prediction of academic risk? and 2) Is the knowledge about the uncertainty helping the decision-maker to make better decisions or to provide better advice? To answer these questions, the tool presented in the case study will be used by two experimental groups of counselors. One group will see the prediction and the uncertainty visualization. The second group will see only the prediction visualization. A third, control group will continue to use the counseling system without the academic risk predictor application. The average failure rate for each counselor will be recorded at the end of the semester and compared between the experimental and control groups and also with the failure rate from previous semesters. Surveys will be conducted just after the counseling sessions in order to establish the level of understanding of the uncertainty in the prediction.

Finally, the ideas presented in this paper could be adapted to other types of Visual Learning Analytics tools, especially those focused on prediction and forecasting. The methodology followed in this paper could be a general framework for these adaptations: 1) exploring the main sources of uncertainty in the analysis, 2) establishing methods to measure or estimate the uncertainty contribution of those sources, 3) using existing visualization techniques to present the uncertainty values in a way that is easy to interpret by the end-user, 4) providing control to the end-user through interactive visualizations to change the parameters of the models and to select the desired data and 5) evaluating the impact of the visualization. Visualizing the uncertainty is a way to empower the user of Visual Learning Analytics tools, stressing that automatic analysis could support, but not replace, human judgment.

6. ACKNOWLEDGMENTS

The author wants to acknowledge the contribution of the Secretaría Nacional de Educación Superior, Ciencia y Tecnología (SENESCYT) in Ecuador and the Fonds Wetenschappelijk Onderzoek - Vlaanderen (FWO) in Belgium through the funding of the "Managing Uncertainty in Visual Analytics" project, to which this work belongs.

7. REFERENCES

[1] K. E. Arnold and M. D. Pistilli. Course Signals at Purdue: Using learning analytics to increase student success. In Proceedings of the 2nd International Conference on Learning Analytics and Knowledge, pages 267-270. ACM, 2012.
[2] J. C. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms. Kluwer Academic Publishers, 1981.
[3] S. Deitrick and R. Edsall. The influence of uncertainty visualization on decision making: An empirical evaluation. Springer, 2006.
[4] G. W. Dekker, M. Pechenizkiy, and J. M. Vleeshouwers. Predicting students drop out: A case study. In International Conference on Educational Data Mining (EDM). ERIC, 2009.
[5] C. Demmans-Epp, S. Bull, and M. Johnson. Visualising uncertainty for open learner model users. In CEUR Proceedings associated with UMAP 2014, 2014.
[6] R. Ferguson. Learning analytics: drivers, developments and challenges. International Journal of Technology Enhanced Learning, 4(5):304-317, 2012.
[7] A. Fortenbacher, L. Beuster, M. Elkina, L. Kappe, A. Merceron, A. Pursian, S. Schwarzrock, and B. Wenzlaff. LeMo: A learning analytics application focussing on user path analysis and interactive visualization. In Intelligent Data Acquisition and Advanced Computing Systems (IDAACS), 2013 IEEE 7th International Conference on, pages 748-753, 2013.
[8] D. Gomez, C. Suarez, R. Theron, and F. Garcia. Advances in Learning Processes, chapter Visual Analytics to Support E-learning. InTech, 2010.
[9] G. Méndez, X. Ochoa, and K. Chiluiza. Techniques for data-driven curriculum analysis. In Proceedings of the Fourth International Conference on Learning Analytics and Knowledge, LAK '14, pages 148-157, New York, NY, USA, 2014. ACM.
[10] M. C. Politi, P. K. Han, and N. F. Col. Communicating the uncertainty of harms and benefits of medical interventions. Medical Decision Making, 27(5):681-695, 2007.
[11] C. Rampell. Colleges mine data to predict dropouts. The Chronicle of Higher Education, 54(38):A1, 2008.
[12] J. L. Santos, K. Verbert, S. Govaerts, and E. Duval. Addressing learner issues with StepUp!: an evaluation. In Proceedings of the Third International Conference on Learning Analytics and Knowledge, pages 14-22. ACM, 2013.
[13] D. Spiegelhalter, M. Pearson, and I. Short. Visualizing uncertainty about the future. Science, 333(6048):1393-1400, 2011.
[14] J. Thomas and P. C. Wong. Visual analytics. IEEE Computer Graphics and Applications, 24(5):20-21, 2004.
[15] J. Thomson, E. Hetzler, A. MacEachren, M. Gahegan, and M. Pavel. A typology for visualizing uncertainty. In Electronic Imaging 2005, pages 146-157. International Society for Optics and Photonics, 2005.
[16] D. S. Wilks. Statistical Methods in the Atmospheric Sciences, volume 100. Academic Press, 2011.