=Paper=
{{Paper
|id=Vol-1903/paper24
|storemode=property
|title=Application of data mining and process mining approaches for improving e-learning processes
|pdfUrl=https://ceur-ws.org/Vol-1903/paper24.pdf
|volume=Vol-1903
|authors=Katalina Grigorova,Elena Malysheva,Sergey Bobrovskiy
}}
==Application of data mining and process mining approaches for improving e-learning processes ==
Application of Data Mining and Process Mining approaches for improving e-Learning Processes

K. Grigorova¹, E. Malysheva², S. Bobrovskiy²

¹ Angel Kanchev University of Ruse, 8 Studentska str., Ruse 7017, Bulgaria

² Volga Region State University of Services, 4 Gagarina str., 445677, Togliatti, Russia

Abstract

The article describes the basic principles and methods of Data mining and Process mining, their similarities and differences. The authors examine research in the field of Educational Data Mining, which applies Data mining techniques in education, give examples of problems that can be solved with Data mining and Process mining techniques in traditional and e-learning, and describe the possibilities and limitations of different methods. Some examples of specialized software for Data mining and Process mining are presented. A review of major scientific conferences and journals devoted to research in Educational Data Mining is given.

Keywords: Data Mining; Process Mining; Educational Data Mining; e-Learning

1. Introduction

Modern information systems have accumulated a huge amount of data about processes taking place in various domains. Many of today's information systems, including e-Learning systems, collect and store data about the events occurring during their operation in so-called event logs. Data mining and Process mining technologies make it possible to use event log data to analyze and improve processes. The availability of advanced software for Data mining and Process mining makes it possible to test these techniques on data obtained from real processes. The growing interest in Data mining and Process mining is driven by the constant increase in the amount of data recorded in information systems, including event data that provide detailed information about the history of processes, and by the need to improve and support business processes in a competitive and rapidly changing environment.
Data mining and Process mining are complementary approaches that can reinforce each other. Process models discovered from and aligned with event log data confirm the value of data analysis and provide a basis for the further development of both Process mining and Data mining.

2. Data Mining and Process Mining: An Overview

Data are at the core of both Process mining and Data mining. The two approaches have a lot in common, as they use the same mathematical algorithms and techniques. The main difference is that Data mining operates on data in general, whilst Process mining works with event data, which contain information about processes [1].

2.1. Definitions and Methods of Data mining

Data mining is a multidisciplinary area that has arisen and developed on the basis of fields such as applied statistics, artificial intelligence, pattern recognition, machine learning, algorithm design, database theory and others. Data mining may consist of the following steps: identification of patterns and associations (free search), use of the discovered association rules to predict unknown values (predictive analytics), and identification and analysis of exceptions to the identified rules (anomaly detection).

Here are some definitions of the concept. Gartner Group, an agency that analyzes information technology markets, defines Data mining as follows: “The process of discovering meaningful correlations, patterns and trends by sifting through large amounts of data stored in repositories. Data mining employs pattern recognition technologies, as well as statistical and mathematical techniques” [2]. SAS Institute, a developer of analytical software, refers in its definition to large data sets and to practical usefulness: “Data mining is the process of finding anomalies, patterns and correlations within large data sets to predict outcomes.
Using a broad range of techniques, you can use this information to increase revenues, cut costs, improve customer relationships, reduce risks and more” [3]. The Data Mining Curriculum [4] gives the following definition: “Data mining is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems”.

Data mining methods and algorithms include decision trees, symbolic rules, cluster analysis, the nearest neighbor method, Bayesian networks, artificial neural networks, support vector machines, linear regression, correlation and regression analysis, association rules, evolutionary programming and genetic algorithms, a variety of methods for data visualization, and many others. Most of the analytical methods used in Data mining are well-known mathematical algorithms and methods. What is new is the possibility of applying them to a variety of concrete problems, thanks to the availability of appropriate hardware and software.

3rd International conference “Information Technology and Nanotechnology 2017” 115 Data Science / K. Grigorova, E. Malysheva, S. Bobrovskiy

2.2. The Basic Principles and Methods of Process mining

Process mining is a relatively young research discipline. The idea of Process mining is to discover, monitor and improve actual processes by extracting knowledge from the event logs readily available in modern information systems [1], [5]. Process mining sits between Big data and Data mining on the one hand, and Business Process Modeling and Analysis on the other. The large volumes of data that businesses generate, and the deployment of business logic across all levels of the business, provide an opportunity for theoretical and practical research in these interrelated and topical areas.
Applying the principles of Data science to various aspects of business processes represents a new approach to their modeling and management. More and more data about business processes are recorded by information systems in the form of so-called event logs, which can be used as input for discovering business process models. Although event data are available in organizations, the organizations often lack an understanding of their real-life processes. The knowledge hidden in event logs can be converted into useful management information.

Process mining includes automated process discovery (extracting process models from event logs), conformance checking (monitoring deviations by comparing the model with the event logs), identification of the organizational structure, automated construction of simulation models, model extension and repair, and prediction of process behavior in order to develop recommendations on the basis of the process history. Although this technology has been developed only recently, it can be applied to any type of operational process in different organizations and systems.

Process mining techniques provide new means for discovering, monitoring and improving processes in various fields of application, and offer opportunities for stricter conformance checking and for validating the reliability of information about the basic processes of an organization. Process mining is an important tool for modern organizations that need to manage non-trivial operational processes: on the one hand, event data are growing at an incredible rate; on the other hand, processes have to be aligned with the need for effective customer service.

One of the main directions of modern Data mining application is Educational Data Mining (EDM). The main goal of EDM is to use the huge amount of data about educational processes, coming from different sources in different formats and with different levels of detail.
These data carry information about the educational process and support a better understanding of learning and the improvement of its outcomes.

3. Data and problems in EDM

Nowadays there is a wide variety of educational environments and information systems in the field of education. CBE (Computer-based education) refers to the use of computers in education to provide directed training and generate instructions for the student. The first CBE systems were stand-alone educational applications that ran on a local computer without using artificial intelligence for student modeling, adaptation, personalization, and so on. The global use of the Internet has led to the development of many new Web-based educational systems, such as e-learning systems, distance learning systems, online training systems, and so on, and the increasing use of artificial intelligence has led to the emergence of new intelligent and adaptive educational systems.

The main types of systems currently in use include LMS (Learning management systems) [7], ITS (Intelligent tutoring systems) [8], AIHS (Adaptive intelligent hypermedia systems) [9], test and quiz systems [10], and others. Each of them provides a variety of data sources that need to be processed in different ways, depending on the nature of the available data and on the specific problems and tasks to be solved with Data mining techniques.

In Educational Data Mining, researchers use data from educational systems such as distance learning systems, intelligent computer-based training, electronic manuals, school information systems, online classes and discussion forums, and computer-aided testing systems [11].
The data have typical characteristics, such as multiple levels of hierarchy (a subject level, a grading level, a question level), context (a specific student in a particular class answers a specific question at a particular time on a particular date), fine-grained data (data recorded at different resolutions to facilitate different analyses, for example every 20 seconds) and longitudinal data (a large amount of data recorded over many sessions over an extended period of time, for example semester-long and year-long courses) [12].

EDM analyzes data generated by any type of information system supporting training or education (universities, schools, colleges and other academic or professional education institutions providing traditional and modern forms and methods of training, as well as informal learning). These data are not limited to the interaction of individual students with the educational system (for example, test input, navigation through the training and testing system, interactive exercises), but may also include data about cooperation between students (e.g. text chat), administrative data (e.g. school, district, teacher), demographics (e.g. gender, age, school class), student affect (e.g. motivation, emotional state) and so on.

Since the main purpose of Data mining in the field of education is to improve the quality of training, quantitative measurements are more difficult to obtain than in other areas, and the results should be evaluated through indicators such as improved efficiency. Data-driven decisions are thus formed with the aim of improving current educational processes and teaching materials. EDM is often used when working with educational programs, in modeling student behavior and in forecasting course results.
Examples of problems solved with the help of EDM are:
* Monitoring the progress of learning to detect in real time undesirable student behavior, such as dropping out of training, low motivation, misuse of educational forums, abuse, cheating, etc., and alerting the parties concerned [13]; providing feedback to teachers to support decision-making on improving student learning and taking pre-emptive action to remedy the situation [10];
* Predicting student achievement and assessing knowledge and learning outcomes [10]; forming recommendations to students based on their interests and activities in the learning process [14];
* Individual approach: adapting training to each student, including course content, navigation through the course, and presentation of the material [15], [16]; identifying groups of students according to their individual characteristics, personal traits, features of their training, etc. [17], [18];
* Building curricula and educational content [19], [20]; planning and scheduling future courses, planning resource allocation, organizing access to learning materials, planning consultations, curriculum development, etc. [21];
* Developing and validating scientific theories about learning technology and forming new scientific hypotheses [22]; modeling the domain of instruction in terms of concepts, skills, training modules and their relationships [23];
* User / student modeling (cognitive models of students representing their skills and knowledge) [24]; estimating the parameters of probabilistic models from learning data to determine the likelihood of events of interest [25].
The variety of problems and data in education makes it necessary to adapt Data mining and Process mining methods to these data and problems. The applicability of Data mining techniques in the field of education is considered in [12], [26].

4. Data Mining and Process Mining methods in EDM and e-Learning systems

In Educational Data Mining, the most commonly used methods are Classification, Clustering, Text mining (text data mining and text analytics), Relationship mining, Knowledge tracing, Bayesian modeling and Social network analysis, as well as Outlier detection, Discovery with models, Distillation of data for human judgment, Nonnegative matrix factorization, and Process mining techniques and algorithms such as Alpha algorithms, Heuristic algorithms, Probabilistic algorithms, Genetic algorithms, etc.

Prediction determines how a target attribute depends on a combination of other attributes. The types of prediction methods are classification (the target variable is a category), regression (the target and input variables are numbers) and density estimation (the predicted value is a probability density function). The use of these methods to predict student performance and to identify patterns of student behavior is considered in [27] and [28].

Clustering is the identification of groups of similar instances. A distance measure is typically used to determine similarity. Once the set of clusters has been determined, new items can be assigned to the nearest cluster. In EDM, clustering can be used to group similar course materials or to form groups of students based on their knowledge and patterns of interaction [29], [30]. Examples of the applicability of various types of clustering algorithms in EDM are discussed in [31].

Text Mining is a method of producing high-quality information from text.
Typical text mining tasks include text categorization, text clustering, concept / entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling. In EDM, text mining has been used to analyze the content of discussion boards, forums, chats, Web pages, documents, and so on [32].

Relationship Mining determines the relationships between variables and presents them in the form of rules for subsequent use. There are different types of relationship mining, such as association rule mining (relations between variables), sequential pattern mining (temporal associations between variables), correlation mining (linear correlations between variables) and causal data mining (causal relationships between variables). Relationship mining can be used to identify relationships in student behavior (behavior patterns) and to diagnose learning difficulties or mistakes that often occur together [33].

Knowledge Tracing (KT) is a popular method for assessing student skills, which is used in effective cognitive tutor systems [34]. KT uses a cognitive model that maps each problem-solving item to the skills it requires, and records the correct and incorrect responses of students as evidence of their knowledge of a particular skill. It tracks student knowledge over time and is parameterized by four variables. KT can be expressed as a Bayesian network.

Social Network Analysis (SNA) aims to understand and measure the relationships between entities in networked information. SNA considers social relationships in terms of network theory, with nodes (representing individual actors within the network) and connections or ties (representing relationships between individuals, such as friendship, kinship, organizational position, etc.).
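As a minimal illustration of the node-and-tie view described above, the sketch below computes normalized degree centrality for a small network of forum replies. The student names, the reply data and the `degree_centrality` helper are hypothetical and serve only to show the idea; real analyses would normally rely on a dedicated graph library and richer measures.

```python
from collections import defaultdict

# Hypothetical forum replies: (author of the reply, author being replied to).
replies = [
    ("ana", "boris"),
    ("boris", "ana"),
    ("clara", "ana"),
    ("dmitri", "ana"),
    ("clara", "boris"),
]


def degree_centrality(edges):
    """Return normalized degree centrality, treating ties as undirected."""
    neighbours = defaultdict(set)
    for source, target in edges:
        if source != target:  # ignore self-replies
            neighbours[source].add(target)
            neighbours[target].add(source)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    # Share of the other actors that each actor is directly tied to.
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


centrality = degree_centrality(replies)
for student, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{student}: {score:.2f}")
```

In this toy network the student tied to every other participant receives the highest score, which is the kind of signal a teacher might use to spot central or isolated participants.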
In EDM, Social Network Analysis can be used to obtain information for interpreting and analyzing the structure and relationships in interaction tasks, including interaction through communication tools [35].

Outlier Detection identifies data that differ significantly from the rest of the data. Outliers are observations (or measurements) that are usually much larger or smaller than the other values. In EDM, outlier detection can be used to detect students with learning difficulties, deviations in the actions or behavior of a student or a teacher, and irregular learning processes [36].

Discovery with Models uses a previously validated model of a phenomenon (developed through prediction, clustering, or manual knowledge engineering) as a component of another analysis, such as prediction or relationship mining [37]. This method is often used in EDM and supports the identification of relationships between student behavior and student characteristics, the use of psychometric modeling frameworks in machine-learning models, and the analysis of research questions across a variety of contexts [38].

Distillation of Data for Human Judgment presents data in an understandable form, using summarization, visualization and interactive interfaces, in order to extract useful information and support decision making. This method comprises obtaining statistical data about the learning process to determine its common characteristics, and obtaining summary data and reports on the behavior of the trainee. Data visualization and graphical techniques help to see, explore and understand large amounts of educational data at a glance. In EDM, this approach is known as distillation of data for human judgment [39], and it has been used to assist teachers in visualizing and analyzing student activity and usage information [40].
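The outlier detection idea described above can be sketched as follows. This is a minimal example under stated assumptions, not the method of [36]: it uses the modified z-score based on the median absolute deviation (MAD) with the commonly used cut-off of 3.5, and the learning times, student identifiers and the `mad_outliers` helper are hypothetical.

```python
from statistics import median

# Hypothetical time (in minutes) each student spent on one learning unit.
learning_times = {
    "s1": 42, "s2": 38, "s3": 45, "s4": 40,
    "s5": 41, "s6": 120, "s7": 39, "s8": 5,
}


def mad_outliers(values, threshold=3.5):
    """Return the keys whose modified z-score exceeds the threshold."""
    center = median(values.values())
    mad = median(abs(v - center) for v in values.values())
    if mad == 0:
        # More than half of the values are identical; the score is undefined.
        return []
    return sorted(
        key for key, v in values.items()
        if abs(0.6745 * (v - center) / mad) > threshold
    )


print(mad_outliers(learning_times))
```

A median-based score is used here because, with only a handful of students, a single extreme value inflates the mean and standard deviation and can mask itself. Flagged students would still need to be reviewed by a teacher, since an unusual learning time is not by itself evidence of a learning difficulty.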
Nonnegative Matrix Factorization (NMF) is a technique that has a clear interpretation in terms of a Q-matrix, also referred to as a transfer model [41]. There are many NMF algorithms, and they can give different solutions. NMF represents a matrix of non-negative numbers as the product of two smaller matrices. For example, when a learning process is considered, the matrix may represent students’ test results and can be decomposed into two matrices: Q, which represents the learning elements, and S, which represents each student's skills.

The purpose of Process mining is to extract knowledge about the learning process from the event logs of learning systems in order to represent the entire process, analyze it and improve it. In EDM, Process mining can be used to represent student behavior according to the records in the event log. The data about each event contain a time stamp and data about the learning process. This may be information about the assessment of students' knowledge [42], information on participation in forums and chats, on the viewing of lectures and other educational materials, information about passing tests [43], data describing collaborative learning processes [44], or information about events related to metacognitive prompts [45]. Students can be combined into different groups depending on their behavior. It is important to define what counts as an event (it could be a mouse click) and what counts as a sequence of events. Dotted Chart diagrams are often used to visualize individual events. Process models are then constructed and checked for conformance. Learning process models are constructed and tested with general and special Process mining algorithms (alpha, probabilistic, heuristic and genetic algorithms) as well as with Data mining methods and algorithms. The process model is usually presented as a BPMN model or as a Petri net.
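As a hedged illustration of the first step toward such a process model, the sketch below groups a small, hypothetical event log by student, orders each trace by time stamp, and counts which activity directly follows which. This directly-follows relation is the input on which the Alpha algorithm builds; the sketch stops short of constructing a Petri net or BPMN model. The activity names, student identifiers and the `directly_follows` helper are illustrative assumptions, not taken from any particular tool.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (student id, activity, time stamp in minutes).
event_log = [
    ("s1", "view_lecture", 1),
    ("s1", "take_quiz", 5),
    ("s1", "post_forum", 9),
    ("s2", "view_lecture", 2),
    ("s2", "post_forum", 4),
    ("s2", "take_quiz", 8),
    ("s3", "view_lecture", 3),
    ("s3", "take_quiz", 6),
]


def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    traces = defaultdict(list)
    # Order events by case and time so that each trace is chronological.
    for case, activity, _timestamp in sorted(log, key=lambda e: (e[0], e[2])):
        traces[case].append(activity)
    counts = Counter()
    for trace in traces.values():
        counts.update(zip(trace, trace[1:]))
    return counts


dfg = directly_follows(event_log)
for (first, second), count in dfg.most_common():
    print(f"{first} -> {second}: {count}")
```

Here each student is treated as one case. In practice the choice of case (student, course session, assignment) and of event granularity changes the resulting model considerably, which is why the text stresses defining the event and the sequence of events first.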
Building the learning process model is complicated by the existence of loops and parallel tasks, the presence of "noise", and the mutual influence of some tasks on others.

Unfortunately, in spite of the considerable amount of work in the field of data mining, there are still few papers in Russian scientific journals on the application of Data mining and Process mining technology to the learning process. Among them are the use of artificial neural networks in modeling the educational process in higher education [46], a study of the structure of university students' values by means of cluster analysis [47], the use of Educational Data Mining and Learning Analytics methods in educational qualifications [48], a study of the factors of student adaptation to training conditions using analysis of variance [49], and an overview of the tasks and methods of Data mining in education together with the use of classification algorithms for analyzing data from training systems [50].

5. Software products with the capabilities of Data mining and Process mining

Special software is necessary for the implementation of Data mining and Process mining. More and more software vendors add such features to their products. Examples of software products with the capabilities of Data mining and Process mining are presented in Table 1.

Table 1. Examples of software products with the capabilities of Data mining and Process mining.
{| class="wikitable"
! Tool Name !! Vendor !! Website
|-
| Celonis Process Mining || Celonis GmbH || www.celonis.de
|-
| Disco || Fluxicon || www.fluxicon.com
|-
| Minit || Gradient ECM || www.minitlabs.com
|-
| NLTK || Open Source || www.nltk.org
|-
| Orange || Open Source || orange.biolab.si
|-
| Perceptive Process Mining || Lexmark || www.lexmark.com
|-
| ProM || Open Source || www.promtools.org
|-
| ProM Lite || Open Source || www.promtools.org
|-
| QPR ProcessAnalyzer || QPR || www.qpr.com
|-
| RapidProM || Open Source || www.rapidprom.org
|-
| RapidMiner || Open Source || www.rapidminer.com
|-
| Rialto Process || Exeura || www.exeura.eu
|-
| SNP Business Process Analysis || SNP AG || www.snp-bpa.com
|-
| SPSS || IBM || www-01.ibm.com/software
|-
| WEKA || Open Source || www.cs.waikato.ac.nz/ml/weka/
|}

One commonly used tool is the freely available ProM. ProM has over 1,500 plug-ins that support different methods and algorithms for Data mining and Process mining, different types of data and models, conversion of data and models, etc.; the ProM Lite version contains the most commonly used modules. Most commercial software products that include Data mining and Process mining capabilities are easy to use. Approximately 40 software products often used for Data mining in the field of education are described in [6].

6. Scientific conferences and journals in the field of Educational Data Mining

EDM has become an independent research area in recent years. It draws on research in intelligent learning systems: Intelligent tutoring systems (ITS), Artificial intelligence in education (AIED), User modeling (UM), Technology-enhanced learning (TEL), and Adaptive and intelligent educational hypermedia (AIEH). The first conference, EDM2008, was held in Montreal, Canada; it was followed by EDM2009 in Cordoba, Spain; EDM2010 in Pittsburgh, USA; EDM2011 in Eindhoven, the Netherlands; EDM2012 in Chania, Greece; EDM2013 in Memphis, USA; EDM2014 in London, UK; EDM2015 in Madrid, Spain; EDM2016 in Raleigh, USA; and EDM2017 in Wuhan, China.
Table 2 summarizes some of the conferences that correspond to the field of EDM.

Table 2. Scientific conferences that correspond to the category EDM.

{| class="wikitable"
! Title !! Short title !! Type !! Starting year
|-
| International Conference on Artificial Intelligence in Education || AIED || every two years || 1983
|-
| International Conference on Educational Data Mining || EDM || annual || 2008
|-
| International Conference on Intelligent Tutoring Systems || ITS || every two years || 1988
|-
| International Conference on Learning Analytics and Knowledge || LAK || annual || 2011
|-
| International Conference on User Modeling, Adaptation, and Personalization || UMAP || annual || 2009
|}

Table 3 provides examples of journals corresponding to the field of EDM.

Table 3. Examples of journals corresponding to the field of EDM.

{| class="wikitable"
! Title !! Short title !! Publisher
|-
| ACM Special Interest Group on Knowledge Discovery and Data Mining, Explorations || SIGKDD Explorations || ACM
|-
| Computers and Education || CAE || Elsevier
|-
| IEEE Transactions on Knowledge and Data Engineering || TKDE || IEEE
|-
| IEEE Transactions on Learning Technologies || TLT || IEEE
|-
| Internet and Higher Education || INTHIG || Elsevier
|-
| International Journal of Artificial Intelligence in Education || IJAIED || AIED Society
|-
| Journal of Educational and Behavioral Statistics || JEBS || SAGE Publications
|-
| Journal of Educational Data Mining || JEDM || EDM Society
|-
| Journal of the Learning Sciences || J Learn Sci || Taylor&Francis
|-
| User Modeling and User-Adapted Interaction || UMUAI || Springer
|}

The subject of the domain is covered most closely by the Journal of Educational Data Mining (http://www.educationaldatamining.org/JEDM/), published since 2009. The Journal of Educational Data Mining is an online journal with free access.

7.
Conclusion

The paper discusses the basic principles of research in the EDM domain, gives examples of tasks that can be solved with Data mining and Process mining in traditional and e-learning, describes the possibilities and limitations of different methods, and presents an overview of the major scientific conferences and journals devoted to the application of Data mining and Process mining techniques in education. EDM makes it possible to investigate the content of learning materials in e-learning systems and the processes performed in them.

The use of Information and Communication Technologies in education generates a large amount of data containing comprehensive information about students and the processes they pass through in the course of their education. Stakeholders (teachers, instructors, etc.) can examine these data to understand students' learning habits, the factors affecting their performance, and the skills they acquire. Research interest in the use of Data mining in education is growing in order to answer these questions. EDM is a discipline aimed at developing specific methods for studying educational databases generated by any type of information system supporting training or education (schools, colleges, universities, or vocational training institutions offering traditional and/or modern teaching methods and informal learning). EDM brings together researchers and practitioners from computer science, education, psychology, psychometrics, and statistics.

The basic idea of Process mining is to discover, monitor and improve real processes by extracting knowledge from event logs automatically recorded by information systems. This approach can be applied to the problems of education.
The main goals in this direction are:
* The extraction of process-related knowledge from large educational event logs, such as process models following key performance indicators or a set of curriculum pattern templates.
* The analysis of educational processes and their conformance with established curriculum constraints, educators’ hypotheses and prerequisites.
* The enhancement of educational process models with performance indicators: execution time, bottlenecks, decision points, etc.
* The personalization of educational processes via the recommendation of the best course units or learning paths to students (depending on their profiles, their preferences or their target skills) and the online detection of prerequisite violations.

It can be concluded that the use of the complementary methods of Data mining and Process mining in e-Learning systems can improve the quality of teaching and increase its availability and effectiveness.

Acknowledgements

This work is supported by the Bulgarian National Scientific Research Fund under the contract DFNI - I02/13.

References

[1] Van der Aalst WMP. Process Mining: Discovery, Conformance and Enhancement of Business Processes. Berlin: Springer-Verlag, 2011; 370 p.
[2] Gartner Inc. IT Glossary. URL: http://www.gartner.com/it-glossary/data-mining (21.01.2017).
[3] SAS Institute Inc. URL: http://www.sas.com/en_us/insights/analytics/data-mining.html (21.01.2017).
[4] SIGKDD. URL: http://www.kdd.org/curriculum/index.html (21.01.2017).
[5] IEEE Task Force on Process Mining. Process Mining Manifesto. URL: http://www.processmining.org/blogs/pub2012/process_mining_manifesto (21.01.2017).
[6] Slater S, Joksimović S, Kovanovic V, Baker RSJd, Gasevic D. Tools for Educational Data Mining: A Review. Journal of Educational and Behavioral Statistics 2017; 42(1): 85–106.
[7] Romero CE, Ventura S, Salcines E. Data mining in course management systems: Moodle case study and tutorial. Comput Edu 2008; 51: 368–384.
[8] Mostow J, Beck J.
Some useful tactics to modify, map and mine data from intelligent tutors. J Nat Lang Eng. 2006; 12: 195–208.
[9] Merceron A, Yacef K. Mining student data captured from a web-based tutoring tool: initial exploration and results. J Interact Learn Res 2004; 15: 319–346.
[10] Romero C, Zafra A, Luna JM, Ventura S. Association rule mining using genetic programming to provide feedback to instructors from multiple-choice quiz data. Expert Syst J Knowl Eng. 2013; 30(2): 162–172.
[11] Romero C, Ventura S, Pechenizkiy M, Baker RSJd. Handbook of Educational Data Mining. Chapman & Hall/CRC Press, 2011; 526 p.
[12] Romero C, Ventura S. Data mining in education. The Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2013; 3: 12–27.
[13] Kotsiantis S, Patriarcheas K, Xenos MN. A combinational incremental ensemble of classifiers as a technique for predicting student’s performance in distance education. Knowl-Based Syst. 2010; 23: 529–535.
[14] Tang T, Daniel BK, Romero C. Recommender systems for and in social and online learning environments. Expert Syst J Knowl Eng. 2015; 32(2): 261–263.
[15] Romero C, Ventura S. Preface to the special issue on data mining for personalised educational systems. User Model User-Adapted Interact. 2011; 21: 1–3.
[16] Bannert M, Reimann P, Sonnenberg C. Process mining techniques for analyzing patterns and strategies in students’ self-regulated learning. Metacognition and Learning 2014; 9(2): 161–185.
[17] Bouchet F, Harley JM, Trevors GJ, Azevedo R. Clustering and profiling students according to their interactions with an intelligent tutoring system fostering self-regulated learning. Journal of Educational Data Mining 2013; 5(1): 104–146.
[18] Ayers E, Nugent R, Dean N. A comparison of student skill knowledge estimates. International Conference on Educational Data Mining. Cordoba, Spain, 2009; 1–10.
[19] Cairns AH, Gueni B, Fhima M, Cairns A, David S, Khelifa N.
Towards Custom-Designed Professional Training Contents and Curriculums through Educational Process Mining. The Fourth International Conference on Advances in Information Mining and Management, 2014; 53–58. [20] Garcia E, Romero C, Ventura S, Castro C. Collaborative data mining tool for education. International Conference on Educational Data Mining. Cordoba, Spain, 2009; 299–306. [21] Hsia T, Shie A, Chen L. Course planning of extension education tomeet market demand by using datamining techniques—an example of Chinkuo Technology University in Taiwan. Expert Syst Appl J. 2008; 34: 596–602. [22] Siemens G, Baker RSJd. Learning analytics and educational data mining: towards communication and collaboration. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge. Vancouver, British Columbia, Canada, 2012; 1–3. [23] Pavlik P, Cen H, Koedinger K. Learning factors transfer analysis: using learning curve analysis to automatically generate domain models. Int Conf Edu Data Min. 2009; 121–130. [24] Frias-Martinez E, Chen S, Liu X. Survey of datamining approaches to user modeling for adaptive hypermedia. IEEE Trans Syst Man Cybern C. 2006; 36(6): 734–749. [25] Wauters K, Desmet P, Noortgate W. Acquiring item difficulty estimates: a collaborative effort of data and judgment. International Conference on Educational Data Mining. Eindhoven, The Netherlands, 2011; 121–128. [26] Baker R, Siemens G. Educational data mining and learning analytics. Cambridge Handbook of the Learning Sciences: 2nd Edition, 2014: 253–274. 3rd International conference “Information Technology and Nanotechnology 2017” 120 Data Science / K. Grigorova, E. Malysheva, S. Bobrovskiy [27] Romero C, Espejo P, Zafra A, Romero J, Ventura S. Web usage mining for predicting marks of students that use Moodle courses. Comput Appl Eng Edu J. 2013; 21: 135–146. [28] Baker RSJd, Gowda SM, Corbett AT. Automatically detecting a student’s preparation for future learning: help use is key. 
Fourth International Conference on Educational Data Mining. Eindhoven, The Netherlands, 2011; 179–188. [29] Bogarín A, Romero C, Cerezo R, Sánchez-Santillán M. Clustering for improving educational process mining. Proceedings of the Fourth International Conference on Learning Analytics And Knowledge. ACM - New York, NY, USA, 2014; 11–15. [30] Vellido A, Castro F, Nebot A. Clustering Educational Data. Handbook of Educational Data Mining. Boca Raton, FL: Chapman and Hall/CRC Press, 2011; 75– 92. [31] Dutt A, Aghabozrgi S, Ismail MAB, Mahroeian H. Clustering Algorithms Applied in Educational Data Mining. International Journal of Information and Electronics Engineering 2015; 5(2): 112–116. [32] Tane J, Schmitz C, Stumme G. Semantic resource management for the web: an e-learning application. International Conference of the WWW. New York, 2004; 1–10. [33] Merceron A, Yacef K. Measuring correlation of strong symmetric association rules in educational data. Handbook of Educational Data Mining. Boca Raton, FL: CRC Press, 2011; 245–256. [34] Corbett A, Anderson J. Knowledge tracing: modeling the acquisition of procedural knowledge. User Model User-Adapted Interact 1995; 4: 253–278. [35] Rabbany R, Takaffoli M, Zaïane O. Analyzing participation of students in online courses using social network analysis techniques. International Conference on Educational Data Mining. Eindhoven, The Netherlands, 2011; 21–30. [36] Ueno M. Online outlier detection system for learning time data in e-learning and its evaluation. International Conference on Computers and Advanced Technology in Education. Beijiing, China, 2004l; 248–253. [37] Baker RSJd, Yacef K. The state of educational data mining in 2009: a review and future visions. J Edu Data Min. 2009; 3–17. [38] Bienkowski M, Feng M, Means B. Enhancing teaching and learning through educational data mining and learning analytics: an issue brief. Washington, D.C.: Office of Educational Technology. U.S. Department of Education, 2012; 1–57. 
[39] Baker RSJd. Data mining for education. International Encyclopedia of Education. 3rd ed. Oxford, UK: Elsevier, 2010; 7: 112–118. [40] Mazza R, Milani C. GISMO: a graphical interactive student monitoring tool for course management systems. International Conference on Technology Enhanced Learning. Milan, Italy, 2004; 1–8. [41] Desmarais MC. Mapping question items to skills with non-negative matrix factorization. ACM SIGKDD Explor. 2011; 13: 30–36. [42] Trˇcka N, Pechenizkiy M, van der Aalst W. Process mining from educational data. Handbook of Educational Data Mining. Boca Raton, FL: CRC Press, 2011; 123–142. [43] Mukala P, Buijs J, Leemans M, van der Aalst W. Learning Analytics on Coursera Event Data: A Process Mining Approach. 5th International Symposium on Data-driven Process Discovery and Analysis. Vienna, Austria, 2015; 18–32. [44] Schoor C, Bannert M. Exploring regulatory processes during a computer-supported collaborative learning task using process mining. Computers in Human Behavior 2012; 28: 1321–1331. [45] Sonnenberg C, Bannert M. Discovering the effects of metacognitive prompts on the sequential structure of SRL-processes using process mining techniques. Journal of Learning Analytics 2015; 2(1): 72–100. [46] Petrova MV, Anufrieva DA. Investigation of the possibilities of methods of intellectual data analysis in modeling the educational process in the university. Vestnik Chuvashskogo Universiteta 2013; 3: 280–285. (in Russian) [47] Avadehni YuI, Kulikova OM, Radionova VA. The study of the structure of values of university students with the use of data mining technologies. Sovremennye problemy nauki i obrazovaniya 2013; 6: 841 p. (in Russian) [48] Veryaev AA, Tatarnikova GV. Educational Data Mining i Learning Analytics - directions of development of educational qualification. Prepodavatel' ХХI vek 2016; 2: 150–160. (in Russian) [49] Shumetov VG, Lyaskovskaya OV. 
Study of the factors of adaptation of the students of the 2000s to the training in the university by the methods of data Mining. Srednerusskij vestnik obshchestvennyh nauk 2015; 6: 49–56. (in Russian) [50] Gorlushkina NN, Kocyuba IY, Hlopotov MV. The tasks and methods of intellectual analysis of educational data to support decision-making. Obrazovatel'nye tekhnologii i obshchestvo 2015; 1: 472–482. (in Russian) 3rd International conference “Information Technology and Nanotechnology 2017” 121