Designing Intelligent Systems for Online Education: Open Challenges and Future Directions

Danilo Dessì2,3, Tanja Käser1, Mirko Marras1, Elvira Popescu4 and Harald Sack2,3

1 EPFL, Switzerland
2 FIZ Karlsruhe – Leibniz Institute for Information Infrastructure, Germany
3 Karlsruhe Institute of Technology, Institute AIFB, Germany
4 University of Craiova, Romania

Abstract
The design and delivery of platforms for online education is fostering increasingly intense research. Scaling up education online brings new needs related to, for example, hardly manageable classes, overwhelming content alternatives, and academic dishonesty during remote interaction. However, with the impressive progress of the data mining and machine learning fields, combined with the large amounts of learning-related data and high-performance computing, it has been possible to gain a deeper understanding of the nature of learning and teaching online. Methods at the analytical and algorithmic levels are constantly being developed, and hybrid approaches are receiving increasing attention. Recent methods analyze not only the online traces left by students a posteriori, but also the extent to which this data can be turned into actionable insights and models, to support the above needs in a computationally efficient, adaptive and timely way. In this paper, we present relevant open challenges lying at the intersection between the machine learning and educational communities that need to be addressed to further develop the field of intelligent systems for online education. Several areas of research in this field are identified, such as data availability and sharing, time-wise and multi-modal data modelling, generalizability, fairness, explainability, interpretability, privacy, and ethics behind models delivered for supporting education.
Practical challenges and recommendations for possible research directions are provided for each of them, paving the way for future advances in this field.

Keywords
Education, E-Learning, MOOC, Online Courses, Learning Analytics, Machine Learning, Data Mining.

L2D’21: First International Workshop on Enabling Data-Driven Decisions from Learning on the Web, March 12, 2021, Jerusalem, IL
danilo.dessi@fiz-karlsruhe.de (D. Dessì); tanja.kaeser@epfl.ch (T. Käser); mirko.marras@acm.org (M. Marras); elvira.popescu@edu.ucv.ro (E. Popescu); harald.sack@fiz-karlsruhe.de (H. Sack)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction
The increasing demand for skilled professionals is fostering competition amongst companies and institutions interested in securing the best candidates [1]. Success along such a competitive professional path often depends on the individual’s ability to continuously acquire knowledge and master skills relevant for the position under consideration. Online education plays a crucial role in instilling knowledge and skills in life-long learners, acting as an ecosystem that bridges individuals (e.g., learners, teachers), resources (e.g., videos, slides), technologies (e.g., platforms, devices, tools), cultural habits (e.g., community sharing), and policy-making strategies (e.g., business models, learning goals). Learning institutions and training providers are encouraged to deliver teaching online, thanks to its technical, economic, and operational feasibility [2]. Students, in turn, are taking advantage of the flexibility, accessibility, and often lower costs of learning online. This win-win situation has led to a proliferation of online initiatives with thousands of students and teachers [3, 4].

Scaling up online education towards these numbers poses key challenges, such as hardly manageable classes, low levels of individualized support, overwhelming content alternatives, and academic dishonesty, that have a fundamental impact on the quality of learning and teaching [5]. Capitalizing on large amounts of learning-related data and high-performance computing, the rapid evolution of data mining and machine learning is making it possible to devise solutions that can mitigate the above challenges [6]. For instance, intelligent systems for automated content tagging can support teachers during such an error-prone task [7, 8]. Moreover, biometric recognition can help to ensure academic integrity [9]. Clickstream analysis and early warning systems can support teachers in identifying, and acting in a timely manner upon, risk factors for dropping a course [10]. These opportunities have generated wide interest among researchers, educators, policy makers, and businesses. Turning these opportunities into real-world actionable insights and models requires careful design, development, and assessment.

Several high-quality review papers have provided an overview of the recent advances in data mining and machine learning for education over the last years. For instance, Koedinger et al. [11] discussed how data mining has been leveraged to shed light on the psychology of learning from different perspectives, such as assessment, model discovery, the role of affect, motivation, metacognition, and collaborative learning. Salloum et al. [12] further summarized the way data mining supported the most recent trends in educational research and how machine learning has been adopted in the field of education. Romero and Ventura [13] discussed relevant papers, the experimental pipeline, the educational environments, the tools, the data sets, and the main tasks where data mining and machine learning have been used in education. Hernández-Blanco et al.
[14] specifically focused on the research in deep learning applied to educational data. These works have greatly helped us in summarizing the progress in the field.

In contrast to the above studies, this paper does not aim to provide an exhaustive review of existing methodologies, but to summarize and discuss a range of open challenges and future directions in intelligent systems for online education, from the data mining and machine learning community perspective. These points have emerged from the discussion carried out during the "First International Workshop on Enabling Data-Driven Decisions from Learning on the Web (L2D 2021)". Specifically, this paper identifies important challenges that need to be addressed to gain a deeper understanding of this vast field, and discusses emerging topics and contemporary applications that require further research, especially at large scale. Ten fundamental areas are identified and open challenges in each of them are highlighted, including data availability and sharing, time-wise and multi-modal data modelling, generalizability, fairness, explainability, privacy, and ethics. Together with a discussion, we present promising research directions that should be examined to turn these challenges into opportunities.

2. Field Analysis and Discussion
In what follows, we present ten action areas with respect to the application of data mining and machine learning in online education, emphasizing challenges and possible future directions.

Area 1: Data Collection and Sharing. The reuse of educational data has the potential to impact research in this field. Research data reuse can only be achieved through data sharing and can bring benefits to discovery and innovation, such as the ability to ask new research questions on the same data, re-examine existing methods and models, and enable replicable and reproducible research.
While the availability of data has increased with the proliferation of educational data repositories (such as PSLC DataShop [15]), there is still an intensive discussion on how to handle data sharing and reusability outside of the training and educational institutions, schools, research centers and laboratories where the data was originally collected. Legal and ethical frameworks regulate data access in these contexts, and getting approval from the corresponding boards to access such data presents challenges even for local researchers. In addition, collecting data from a large number of participants over a course segment long enough to provide evidence of learning is challenging. Consequently, educational data sets are often small and do not allow the development of advanced and generalizable machine learning models. Furthermore, as education is a heterogeneous field, data covering diverse educational contexts would be needed to support cross-context analyses. Besides data, the availability, re-use, and sharing of other artifacts, e.g., source code and pre-trained models, would require further exploration to become a common practice and promote research. For instance, these open science practices and their adoption in education were discussed in [16].

Area 2: Multi-Modal Analysis and Modelling. Understanding and optimizing educational paths in the real world increasingly requires meaningfully capturing data pertaining to multiple interaction modalities. It becomes crucial to extend current research with contributions on new services and analyses driven by multi-modal data, extract insights from this complex multi-modal data across different educational environments, and shape high-quality, effective, and timely multi-modal feedback for students and teachers.
Given the heterogeneity of educational data, this scenario brings some unique challenges during the development of data mining and machine learning models able to interpret and reason about multi-modal educational interactions. For instance, it is becoming important to learn how to represent and summarize multi-modal data in a way that exploits the complementarity and redundancy of multiple modalities. Mapping data from one modality to another to ease their combination, and identifying direct relations between elements from two or more modalities, also require further research. Further studies would be needed to better understand how to deal with the transfer of knowledge between modalities, their representation, and their modelling, to perform a prediction. Challenges in this area were also highlighted by [17, 18], for instance.

Area 3: Time-Wise Analysis and Modelling. The increasing adoption of data tracking and logging methods in online educational platforms has made it possible to collect students’ traces across weeks, months, and even years. The growing amount of time series data has fostered research on this data type in education, giving rise to new methods for representing, indexing, clustering, and classifying time series (e.g., [19, 20]). However, analyzing time series is often considered one of the most challenging problems in educational data mining and machine learning. Sampling time series data usually requires making multiple design choices, such as the sampling frequency (e.g., per day or per week). Segmenting users’ sessions is also hard because it is often unknown whether and why users have stopped their sessions. Time series data is rarely fed directly into models, and therefore an additional time-consuming and cognitively demanding feature engineering task is required, bringing potential shortcomings related to the loss of relevant information for the task under consideration.
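To make these design choices concrete, the following is a minimal sketch, not a method proposed by any of the cited works. It assumes a student's trace is simply a list of event timestamps, uses a hypothetical 30-minute inactivity threshold to segment sessions, samples at a weekly frequency, and assigns each session to the week in which it starts. The names `SESSION_GAP`, `split_sessions`, `weekly_features`, and `average_over_time` are illustrative only.

```python
from datetime import timedelta

# Assumption: a gap longer than this between two events starts a new session.
SESSION_GAP = timedelta(minutes=30)


def split_sessions(timestamps, gap=SESSION_GAP):
    """Group event timestamps into sessions separated by long inactivity gaps."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)   # still within the current session
        else:
            sessions.append([ts])     # inactivity gap exceeded: new session
    return sessions


def weekly_features(timestamps, course_start, n_weeks):
    """Return one [event_count, session_count] pair per course week.

    A session is counted in the week in which it starts; sessions that
    start outside the course window are ignored.
    """
    features = [[0, 0] for _ in range(n_weeks)]
    for session in split_sessions(timestamps):
        week = (session[0] - course_start).days // 7
        if 0 <= week < n_weeks:
            features[week][0] += len(session)
            features[week][1] += 1
    return features


def average_over_time(features):
    """Collapse the time dimension by averaging each feature across weeks."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]
```

Changing the threshold or the sampling frequency changes the resulting features, which is exactly the sensitivity discussed in this area; averaging across weeks discards the order in which activity occurred.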
Manipulating features extracted for defined time frames also calls for models able to work on an additional data dimension. Unfortunately, several models cannot be directly extended to deal with the time dimension and, hence, features are often vectorized or averaged across time at the training and prediction stages.

Area 4: Generalizability and Transferability. In this area, several challenges revolve around further investigation of whether research findings uncovered in a specific educational environment can be generalized to other contexts. By extension, further explorations would be needed on the extent to which machine learning models trained with data from a given context can generalize well to other contexts. Therefore, identifying educational patterns that can be observed in different contexts and developing features and models with performance generalizable across contexts would be a driver of future research. Transfer learning has the potential to be one of the effective techniques to deal with this aspect, as it exploits knowledge present in data from a source context to enhance a model in a target context with little data availability [21, 22]. This strategy promises to greatly reduce the cost and effort of collecting sufficient data to create an effective model in a new target context. When the source and target contexts also differ in the feature space, data distribution, and label space, other challenges arise in filling the data and representation gaps present in the cross-context learning task.

Area 5: Interpretability and Explainability. Data mining and machine learning approaches have shown promising performance in education. To make these approaches ready for the real world, their interpretability and explainability have become pressing issues. For instance, challenges in this area often point to how we can explain why data-driven decisions go wrong and, if data-driven decisions are accurate, why they are accurate and how to leverage them further.
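As one illustration of how a model's decisions can be probed, the sketch below estimates how much a black-box predictor relies on each input feature by shuffling that feature and measuring the drop in accuracy, a model-agnostic technique commonly known as permutation importance. It is an assumption-laden example, not a method taken from the works cited here: `model` is any hypothetical callable mapping a feature list to a label, rows are plain Python lists, and the function names are illustrative.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the given label."""
    return sum(model(row) == y for row, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Average accuracy drop observed when each feature is shuffled.

    rows must be a list of lists. A feature the model ignores yields a
    drop of exactly 0; larger values suggest heavier reliance on it.
    """
    rng = random.Random(seed)  # fixed seed so that results are repeatable
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)  # break the link between feature j and the labels
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances
```

For example, with a hypothetical pass/fail predictor that thresholds only a quiz-score feature, the importance of any other feature comes out as zero. Such scores describe reliance, not causality, so they would still need to be interpreted together with educational stakeholders.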
Research in the general-purpose machine learning field has suggested measures and frameworks to capture interpretability and explainability, and the topic of explainable machine learning has become prominent. Popular libraries have started to provide or include their own interpretability and explainability tools. Furthermore, the proliferation of interpretability and explainability assessment criteria (e.g., reliability, causality) would support our understanding of how models make decisions and how they can be improved. Interpreting and explaining the decisions made by models, uncovering the patterns within the inner mechanisms of a model, and empowering educational platforms with explainable models would be crucial to raise the credibility of intelligent systems in online education. The need for interpretable models in students’ performance prediction was, for example, discussed by [23].

Area 6: Personal Privacy. There exists an important trade-off between privacy and personalization. User modeling in education often has significant privacy implications because personal data about users (e.g., students and teachers) needs to be collected to adapt platforms to individuals. Educational entities have to comply with privacy, policy and legal requirements when collecting, storing, analyzing, and disclosing potentially identifiable information from students for data mining and machine learning. Privacy perspectives in learning analytics were discussed by [24], for instance. With this in mind, further explorations on frameworks that allow users to know what information will be disclosed and how much control they have over it would be needed. Furthermore, adversarial machine learning models might be able to extract sensitive personal attributes from anonymized data. When such information is extracted and used without the users’ consent, issues of function creep and privacy infringement emerge.
The extent to which data and models in education suffer from these issues would need to be explored in depth. Similarly, unmasking the identity of a person by linking information from disparate educational sources would represent a privacy breach against which further research would be needed. Exploring the notion of controllable privacy, where cues for specific sensitive attributes are suppressed in the data without compromising the quality of the data for the tasks for which it was originally used, would also open up further research on privacy-preserving educational systems.

Area 7: Bias and Fairness. The massive adoption of techniques, algorithms, models, and tools empowered with data-driven decisions brings into question the fairness and integrity of the underlying educational platforms. Indeed, data-driven approaches can be vulnerable to biases inherent to the data, and further research would be needed to investigate the extent to which algorithms and models emphasize these biases and potentially lead to unfair outcomes for certain individuals or groups in online educational platforms. Biased outcomes could be introduced by using data which is not an accurate sample of the population or is influenced by socio-cultural stereotypes. It also remains under-explored how these undesired effects can be mitigated in the context of educational systems that are increasingly deployed in heterogeneous populations worldwide. For instance, understanding how the geographic provenance of learners and teachers can affect fairness in education was highlighted as an open research challenge in [25]. Determining the underlying causes of biases in educational machine learning models and designing methods that alleviate this problem would represent a core objective. In view of this, assembling large multi-modal educational datasets exhibiting demographic diversity represents a crucial task to meet that objective.

Area 8: Ethics.
Educational machine learning models often recommend actions based on evidence coming from students’ interactions. This workflow gives rise to a number of social and ethical concerns. In response to this, further research would be needed on the way these actions are shared with those who can benefit from them, so that they result in benefits rather than harm. For instance, this could imply developing novel ways of ensuring that any predictive model used to make consequential decisions about students is ethically and responsibly applied in an online educational platform. By extension, this points to the accountability of data mining and machine learning in educational platforms, with stakeholders being able to account for the evidence coming from data-driven predictions and suggestions. Empowering educational stakeholders with these capabilities would require advances in the assessment of the validity of a data mining technique or a machine learning model, going beyond its accuracy and involving stakeholders to ensure adequacy and ethics. For instance, ethical and social impacts in learning analytics in general, and for digitally mediated assessment specifically, were presented in [26, 27].

Area 9: Multi-Sided Modelling. Educational systems often provide personalized information access, especially when the volume of the content would otherwise be overwhelming. In research contexts, these systems are typically evaluated on their ability to provide interventions that satisfy the needs and interests of the end user, usually students. Students would not make use of an educational system if they believed it was not providing interventions that match their needs. However, it is also clear that the end user for whom interventions are generated is often not the only stakeholder in the pipeline. Other users, the providers of resources (usually teachers), and even the system’s own objectives may need to be considered, leading to a multi-sided environment.
For instance, this perspective was discussed in the context of educational recommendations by [28]. Incorporating the perspectives and utilities of multiple stakeholders into the decision-making process would require further research in intelligent systems for education empowered with data mining and machine learning techniques.

Area 10: Offline and Online Evaluation. In data mining and machine learning, it is often easiest to perform offline experiments using existing data sets and a protocol that models user behavior to estimate performance measures, such as prediction accuracy. A more expensive option is to run user studies, where a small set of users is asked to perform tasks using the system, typically answering questions afterwards about their experience. While offline evaluation is easier to conduct, repeatable, fast, and can incorporate arbitrarily many models, it might not reflect well the true utility of models as seen in the real world. Indeed, the final goal is often to measure the change in user behavior and learning outcomes after the related data mining technique or machine learning model has been introduced. An online real-world evaluation at large scale is able to naturally incorporate the current context, tasks or needs of the user, but it is time-consuming: the necessary time scales linearly with the number of evaluated approaches, and it can even harm reputation if bad decisions or interventions are shown. Therefore, further research on protocols that bridge offline experiments, user studies, and online evaluation would be expected.

Other Research Challenges. Besides the aforementioned areas, a number of other current research challenges have emerged.
These include: (a) designing novel sensors for acquiring learning-related data from onlife learning spaces transparently; (b) designing robust feature extraction and matching algorithms that can successfully operate on poor quality data; (c) information fusion techniques for combining different types of data, performance measures, and social information; (d) discovering and mitigating the impact of adversarial techniques that can destabilize educational machine learning models; (e) models for predicting learning and teaching behavior in large-scale systems having thousands of students and teachers; (f) incorporating cognition in model design, such as perception, emotion, and cognitive thinking; (g) federated machine learning models across different data systems to decentralize the source of educational data; (h) visualizations that communicate the output of machine learning models to support awareness, self-reflection, and self-assessment; (i) models for automatic or semi-automatic scoring, automatic evaluation of free text answers, and automatic issuing of badges.

3. Conclusion
In this paper, we covered different perspectives concerning the adoption of intelligent systems in education and identified some of the current research challenges rooted in modern applications and lying at the intersection between data mining, machine learning, and education. Despite the impressively intense and high-quality research, many new and emerging challenges require attention and continuous development, including those on data availability and sharing, time-wise and multi-modal data modelling, generalizability, fairness, explainability, privacy, and ethics. There are also many other challenges and directions, not explicitly mentioned in this paper, to be investigated in this application area. We hope that the challenges and directions highlighted in this paper can inspire advances in intelligent systems for online education.

References
[1] F. M. Malloci, L. P. Penadés, L. Boratto, G. Fenu, A text mining approach to extract and rank innovation insights from research projects, in: Proceedings of the 21st Inter. Conf. Web Information Systems Engineering WISE 2020, volume 12343, Springer, 2020, pp. 143–154.
[2] V. Arkorful, N. Abaidoo, The role of e-learning, advantages and disadvantages of its adoption in higher education, International Journal of Instructional Technology and Distance Learning 12 (2015) 29–42.
[3] D. Shah, ClassCentral Reports: By The Numbers - MOOCs in 2020, 2020. Available at https://www.classcentral.com/report/mooc-stats-2020/.
[4] S. Palvia, P. Aeron, P. Gupta, D. Mahapatra, R. Parida, R. Rosner, S. Sindhi, Online education: Worldwide status, challenges, trends, and implications, 2018.
[5] M. Schophuizen, K. Kreijns, S. Stoyanov, M. Kalz, Eliciting the challenges and opportunities organizations face when delivering open online education: A group-concept mapping study, The Internet and Higher Education 36 (2018) 1–12.
[6] M. Manca, L. Boratto, S. Carta, Behavioral data mining to produce novel and serendipitous friend recommendations in a social bookmarking system, Inf. Syst. Fron. 20 (2018) 825–839.
[7] Z. Kastrati, A. S. Imran, A. Kurti, Integrating word embeddings and document topics with deep learning in a video classification framework, Pattern Recogn. Lett. 128 (2019) 85–92.
[8] L. Boratto, S. Carta, E. Vargiu, RATC: A robust automated tag clustering technique, in: E-Commerce and Web Technologies, 10th International Conference, EC-Web 2009, volume 5692, Springer, 2009, pp. 324–335.
[9] C. Rathgeb, K. Pöppelmann, E. Gonzalez-Sosa, Biometric technologies for elearning: State-of-the-art, issues and challenges, in: Proc. of the 18th International Conference on Emerging eLearning Technologies and Applications, IEEE, 2020, pp. 558–563.
[10] B. Prenkaj, P. Velardi, D. Distante, S. Faralli, A reproducibility study of deep and surface machine learning methods for human-related trajectory prediction, in: Proc. of the 29th ACM Intern. Conference on Information & Knowledge Management, 2020, pp. 2169–2172.
[11] K. R. Koedinger, S. D’Mello, E. A. McLaughlin, Z. A. Pardos, C. P. Rose, Data mining and education, Wiley Interdisciplinary Reviews: Cognitive Science 6 (2015) 333–353.
[12] S. A. Salloum, M. Alshurideh, A. Elnagar, K. Shaalan, Mining in educational data: review and future directions, in: Joint European-US Workshop on Applications of Invariance in Computer Vision, Springer, 2020, pp. 92–102.
[13] C. Romero, S. Ventura, Educational data mining and learning analytics: An updated survey, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 10 (2020) e1355.
[14] A. Hernández-Blanco, B. Herrera-Flores, D. Tomás, B. Navarro-Colorado, A systematic review of deep learning approaches to educational data mining, Complexity 2019 (2019).
[15] K. R. Koedinger, R. S. Baker, K. Cunningham, A. Skogsholm, B. Leber, J. Stamper, A data repository for the EDM community: The PSLC DataShop, Handbook of Educational Data Mining 43 (2010) 43–56.
[16] J. I. Fleming, S. E. Wilson, S. A. Hart, W. J. Therrien, B. G. Cook, Open accessibility in education research: Enhancing the credibility, equity, impact, and efficiency of research, Educational Psychologist (2021) 1–12.
[17] D. D. Mitri, J. Schneider, M. Specht, H. Drachsler, The big five: Addressing recurrent multimodal learning data challenges, in: Proc. of the 2nd Multimodal Learning Analytics Across (Physical and Digital) Spaces, CrossMMLA@LAK 2018, volume 2163, CEUR, 2018.
[18] S. Oviatt, Ten opportunities and challenges for advancing student-centered multimodal learning analytics, in: Proceedings of the 20th ACM International Conference on Multimodal Interaction, 2018, pp. 87–94.
[19] L. Haiyang, Z. Wang, P. Benachour, P. Tubman, A time series classification method for behaviour-based dropout prediction, in: 2018 IEEE 18th International Conference on Advanced Learning Technologies (ICALT), IEEE, 2018, pp. 191–195.
[20] S. Shen, M. Chi, Clustering student sequential trajectories using dynamic time warping, International Educational Data Mining Society (2017).
[21] J. Lagus, K. Longi, A. Klami, A. Hellas, Transfer-learning methods in programming course outcome prediction, ACM Transactions on Computing Education (TOCE) 18 (2018) 1–18.
[22] M. Ding, Y. Wang, E. Hemberg, U.-M. O’Reilly, Transfer learning using representation learning in massive open online courses, in: Proceedings of the 9th International Conference on Learning Analytics & Knowledge, 2019, pp. 145–154.
[23] M. Chitti, P. Chitti, M. Jayabalan, Need for interpretable student performance prediction, in: Proceedings of the 13th Int. Conf. on Developments in eSystems Engineering, 2020.
[24] K. M. Jones, A. Asher, A. Goben, M. R. Perry, D. Salo, K. A. Briney, M. B. Robertshaw, “We’re being tracked at all times”: Student perspectives of their privacy in relation to learning analytics in higher education, Journal of the Association for Information Science and Technology 71 (2020) 1044–1059.
[25] E. Gómez, L. Boratto, M. Salamó, Disparate impact in item recommendation: A case of geographic imbalance, in: Proceedings of the 43rd European Conference on Information Retrieval, ECIR 2021, volume 12656, Springer, 2021, pp. 190–206.
[26] P. Prinsloo, S. Slade, Ethics and learning analytics: Charting the (un) charted, Handbook of Learning Analytics (2017) 49–57.
[27] M. Bearman, P. Dawson, J. Tai, Digitally mediated assessment in higher education: Ethical and social impacts, Re-imagining University Assessment in a Digital World (2020) 23–36.
[28] H. Abdollahpouri, G. Adomavicius, R. Burke, I. Guy, D. Jannach, T. Kamishima, J. Krasnodebski, L. Pizzato, Multistakeholder recommendation: Survey and research directions, User Modeling and User-Adapted Interaction 30 (2020) 127–158.