A Systematic Mapping Study on the Usage of Software Tools for Graphs within the EDM Community

Vladimir Ivančević*
University of Novi Sad, Faculty of Technical Sciences
Trg Dositeja Obradovića 6
21000 Novi Sad, Serbia
dragoman@uns.ac.rs

Ivan Luković
University of Novi Sad, Faculty of Technical Sciences
Trg Dositeja Obradovića 6
21000 Novi Sad, Serbia
ivan@uns.ac.rs

*Corresponding Author

ABSTRACT
The field of educational data mining (EDM) has been slowly expanding to embrace various graph-based approaches to interpretation and analysis of educational data. However, there is a great wealth of software tools for graph creation, visualization, and analysis, both general-purpose and domain-specific, which may discourage EDM practitioners from finding a tool suitable for their graph-related problem. For this reason, we conducted a systematic mapping study on the usage of software tools for graphs in the EDM domain. By analysing papers from the proceedings of previous EDM conferences, we tried to understand how and to what end graph tools were used, as well as whether researchers faced any particular challenges in those cases. In this paper, we compile studies that relied on graph tools and provide answers to the posed questions.

Keywords
Systematic Mapping Study, Graphs, Software Tools, Educational Data Mining.

1. INTRODUCTION
The field of educational data mining (EDM) has significantly expanded over the past two decades. It has attracted numerous researchers with various backgrounds around the common goal of understanding educational data through intelligent analysis and using the extracted knowledge to improve and facilitate learning, as well as the educational process. In 2010, Romero and Ventura published a comprehensive overview of the field with 306 references [26]. In this review, the authors identified 11 categories of educational tasks, two of which dealt with graph structures (for brevity these will be referred to as graphs): social network analysis (SNA) and developing concept maps. However, the authors noted that these two categories featured a lower number of papers (15 or fewer references collected). Somewhat different categories of work were presented in another review of EDM [2], but they did not include any explicit references to graphs.

However, since that time, the interest in approaches and technologies utilizing graphs has increased within EDM. In addition to the results of a literature search on the topic, this could also be evidenced by the appearance of the Workshop on Graph-Based Educational Data Mining (G-EDM)¹ in 2014. As a result, software tools that help researchers or any other user group to utilize graphs or graph-based structures (for brevity these will be referred to as graph tools) are becoming a valuable resource for both the G-EDM and the broader EDM community. As graphs are only slowly gaining wider recognition in EDM, there could still be a lot of questions about which graph tools exist or what educational tasks might be supported by these tools.

¹ http://ceur-ws.org/Vol-1183/

In an attempt to help EDM researchers discover more useful information about potentially suitable graph tools, we reviewed the papers presented at the past EDM conferences, selected those that mentioned any usage of graph tools, and extracted from them information about which graph tools the authors employed, what features of these tools were used, to what end the research in question was conducted, and whether there were any particular challenges while using these tools.

The present study may be classified as a secondary study since we base our approach on collecting other research works and assembling relevant information from them. Secondary studies might be more typical of medical and social sciences, but there are proposed methodologies concerning secondary studies in software engineering as well [13]. Two kinds of secondary studies might be particularly important in this context: systematic review studies and systematic mapping studies [20]. In both cases, there is a clear methodology that is set to reduce bias when selecting other research works, which gives these secondary studies the quality of being systematic. Some of the differences pointed out by Petersen et al. [20] are that systematic reviews tend to focus on the quality of reviewed studies with the aim of identifying best practices, while systematic maps focus more on classification and thematic analysis but with less detailed evaluation of collected studies. Moreover, the same authors consider that the two study types form a continuum, which might complicate some attempts at categorization.

We categorize the present study as a systematic mapping study. This classification is justified by the fact that:
1. we employed a concrete methodology,
2. we did not evaluate the quality of collected papers or the presented results, but
3. we focused on identifying the employed graph tools and the manner in which these tools were used, with the aim of providing an overview of the current practice of using graph tools within the EDM community.

However, we did not restrict our investigation to analysing exclusively titles, abstracts, or keywords, but went through the complete texts to find the necessary information. This aspect might better suit systematic reviews, but it does not change the principal goal or character of our study.

The exact details of the employed methodology, including the research questions, sources of studies, and study selection criteria, are given in Section 2. Section 3 contains the answers to the research questions, most importantly the list of identified graph tools and the trends in their usage in EDM. Section 4 covers the potential limitations of the present study.

2. METHODOLOGY
We mainly followed the guidelines given in [20] but also relied on the example of a mapping study presented in [21]. Given the specificity of our study and the posed research questions, there were some necessary deviations from the standard suggested procedure. The overall process of selecting papers and extracting information, together with the resolution methods for non-standard cases, is presented and discussed in the following subsections.

2.1 Overview
The first step was defining research questions to be answered by the present study. The choice of research questions influenced the subsequent steps: conducting the search for papers, screening the papers, devising the classification scheme, extracting data, and creating a map.

2.2 Research Questions
We defined four principal research questions (RQ1-RQ4) concerning the use of graphs and graph tools in studies by EDM researchers:
• RQ1: Which graph tools were directly employed by researchers in their studies?
• RQ2: Which features of the employed graph tools were used by researchers?
• RQ3: What was the overall purpose of the research that involved or relied on graph tools?
• RQ4: What features did researchers consider to be missing or inadequate in the employed graph tools?

2.3 Search for Papers
We searched through all the papers that were published in the proceedings of the EDM conference series to date, i.e., papers from the first EDM conference in 2008 to the latest, seventh, EDM conference in 2014. The latest EDM conference was special because it also included four workshops (G-EDM being one of them) for the first time. The papers from these workshops were also considered in our search. This amounted to eight relevant conference proceedings that represented the complete source of research works for our study:
1. Proceedings of the 1st International Conference on Educational Data Mining 2008 (Montreal, Canada)
2. Proceedings of the 2nd International Conference on Educational Data Mining 2009 (Cordoba, Spain)
3. Proceedings of the 3rd International Conference on Educational Data Mining 2010 (Pittsburgh, Pennsylvania, USA)
4. Proceedings of the 4th International Conference on Educational Data Mining 2011 (Eindhoven, Netherlands)
5. Proceedings of the 5th International Conference on Educational Data Mining 2012 (Chania, Greece)
6. Proceedings of the 6th International Conference on Educational Data Mining 2013 (Memphis, Tennessee, USA)
7. Proceedings of the 7th International Conference on Educational Data Mining 2014 (London, UK)
8. Extended Proceedings of the 7th International Conference on Educational Data Mining 2014 (London, UK), which included only the workshop papers

All the proceedings are freely offered as PDF files by the International Society of Educational Data Mining² and may be accessed through a dedicated web page.³

² http://www.educationaldatamining.org/
³ http://www.educationaldatamining.org/proceedings

The papers from these proceedings represented our Level 0 (L0) papers, i.e., the starting set of 494 papers. This set included different categories of papers: full (regular) papers, short papers, different subcategories of posters, as well as works from the young researcher track (YRT) or demos/interactive events. The starting set did not include abstracts of invited talks (keynotes), prefaces of proceedings, or workshop summaries.

These papers were then searched and evaluated against our keyword criterion (KC), which led to a set of Level 1 (L1) papers. Our keyword string is of the form KC1 AND KC2, where KC1 and KC2 are defined in the following manner:
• KC1: graph OR subgraph OR clique
• KC2: tool OR application OR software OR framework OR suite OR package OR toolkit OR environment OR editor

The first part of the criterion (KC1) was defined to restrict the choice to papers that dealt with graphs, while the second part (KC2) served to narrow down the initial set of papers to those mentioning some kind of a tool or program in general.

When evaluating KC on each L0 paper, we did a case-insensitive search for whole words only, whether in their singular form (as written in KC1 and KC2) or their plural form (except for the case of “software”). This search also included hyphenated forms that featured one of the keywords from KC, e.g., “sub-graph” was considered to match the “graph” keyword.

As each proceedings file is a PDF document, we implemented a search in the Java programming language using the Apache PDFBox⁴ library for PDF manipulation in Java. However, when extracting content from some papers, i.e., page ranges of a proceedings file, we could not retrieve text in English that could be easily searched. This was most probably caused by the fact that authors used different tools to produce camera-ready versions in PDF, which were later integrated into a single PDF file.

⁴ https://pdfbox.apache.org/

In these instances, usually one of the two main problems occurred: no valid text could be extracted or valid text was extracted but without spacing. In the case of invalid text, we had to perform optical character recognition (OCR) on the problematic page ranges. We used the OCR feature of PDF-XChange Viewer,⁵ which was sufficient as confirmed by our manual inspection of the problematic page ranges (six problematic papers in total). In the case of missing spacing, we had to fine-tune the extraction process using the capabilities of the PDFBox library.

⁵ http://www.tracker-software.com/product/pdf-xchange-viewer

This PDF library proved adequate for our task because we had to search only through PDF files and could customize the text extraction process to solve the spacing problem. However, in the case of a more varied data source, a more advanced toolkit for content indexing and analysis would be needed.

2.4 Screening of Papers
EDM researchers used many of our keywords with several different meanings, e.g., a graph could denote a structure consisting of nodes and edges, which was the meaning that we looked for, or some form of a plot. In order to determine the final set of papers, we performed a two-phase selection on L1 papers:
1. We examined the portions of L1 papers that contained some KC1 keyword and eliminated papers that did not significantly deal with graphs (as structures); this led to a set of Level 2 (L2) papers.
2. We read each L2 paper and eliminated those that did not mention some use of graph tools; this led to the final set of Level 3 (L3) papers.

In the first phase of selection, we examined the sentences that contain KC1 keywords. If this proved insufficient to determine the nature or scope of use of the mentioned graphs, we read the whole paragraph, and sometimes even the paragraph before and the paragraph after. In these cases, we also checked the referenced figures, tables, or titles of the cited papers. If there were still any doubts, we consulted the paper’s title and abstract, as well as glanced over the figures looking for graph examples. If the authors did not use graphs in their presented study or just made a short comment about graphs giving an analogy or mentioning graphs in the context of related or future work, we did not select the paper for the next phase.

In the second phase of selection, we kept only those papers that mention explicit use of a graph tool by the authors. In the cases when the actual use of a mentioned graph tool was not clear, the paper was selected if some of its figures contain a screenshot featuring the tool or a graph visualized using that tool.

The term tool was considered rather broadly in the present study. We did not restrict the search only to well-rounded software applications, but also included libraries for various computer languages, and even computer languages or file formats that were used by researchers to manipulate graphs. By making this decision, we aimed to provide a greater breadth of information to researchers interested in applying graphs within their studies.

2.5 Classification Scheme
The mode of tool usage was categorized in the following manner:
1. CREATION (C): the tool was developed by the paper authors and introduced in the paper;
2. MODIFICATION (M): an existing tool was modified, either through source code or by adding extensions/plugins; and
3. UTILIZATION (U): the tool was utilized without modification.

We also checked the distribution of the collected studies by the continent and the country corresponding to the authors’ affiliation. In cases when there were authors from different countries, we indicated the country of the majority of authors, or, if there was no majority, the country corresponding to the affiliation of the first author.

2.6 Data Extraction and Map Creation
Relevant data from L3 papers was extracted into a table that for each paper included the following information: author list, title, proceedings where it was published, page range within the proceedings, answers to the research questions, and classifications according to the scheme presented in the previous subsection.

3. RESULTS AND DISCUSSION
An overview of the paper selection process is given in Table 1. In each step, the number of relevant papers is significantly reduced. As expected, the effort required per paper increased as the number of selected papers decreased. In the L1 step, the usage of the keyword criterion relatively quickly eliminated many papers. However, in subsequent steps, the selected papers had to be read, either partially (in the L2 step) or fully (in the L3 step). The set of L3 papers represents a selection of EDM studies that were used to identify the usage patterns concerning graph tools. The list of the selected papers is publicly available.⁶

⁶ http://www.acs.uns.ac.rs/en/user/31

Table 1. The number of selected papers at each step
Step | Number of papers
L0 – papers from EDM proceedings | 494
L1 – papers containing keywords | 146
L2 – papers mentioning graphs | 82
L3 – papers mentioning graph tools | 27

Most studies (15) are from North America: USA (14) and Canada (1). Europe is represented by 8 studies from 6 countries: Czech Republic (2), Spain (2), Germany (1), Ireland (1), Russia (1), and UK (1). The remaining two continents represented are Asia (Japan only) and Australia, each providing 2 studies. This somewhat resembles the EDM community present at the EDM conferences and differs little from the structure of the EDM community as reported in 2009 [2].

3.1 Overview of Graph Tools
In Table 2, we list 28 graph tools mentioned in the 27 selected papers.

Table 2. Overview of graph tools from the selected papers
No | Tool | Usage | Features | Purpose | Issues
[...]
3 | AGG Engine | C[14], U[15] | augmented graph grammar engine with recursive graph matching | analyse student-produced argument diagrams | inefficiency in some cases
4 | CASSI | C[19] | collect bullying data via web-form and use them to form a social graph | support classroom management | /
5 | CLOVER framework | U[25] | generate graph vis. | (used in vis. in No. 2) | /
6 | Cmate | U[16] | provide a list of concepts and linking words to build a concept map | tabletop concept mapping | /
7 | D3.js | U[17] | program interactive graph vis. | facilitate graph interpretation in EDA | /
8 | DOT | U[28] | describe graphs | (used in export in No. 14) | /
9 | EDM Vis | C[9], M[10] | interactively vis. ITS log data | understand student problem solving in ITSs | WIP
10 | eJUNG lib. | U[11] | layout graphs | (used in vis. in No. 14) | /
11 | FuzzyMiner (ProM) | U[16] | generate fuzzy models (of student collaboration processes) | discover and analyse student strategies in tabletop collaboration | /
12 | Gephi | U[7] | vis. graphs | identify similarities between LE course content (used together with No. 22) | /
13 | graphML | U[30] | describe graphs (of student resolution proofs) | analyse student solutions of resolution proofs | /
14 | InVis | C[11], M[12, 28] | interactively vis. and edit ITS log data | understand student interaction in ITSs | WIP
15 | LeMo | C[18] | interactively vis. learning object networks | understand how students perform and succeed with resources in LMSs and LPs | /
16 | Meerkat-ED toolbox | C[22] | vis., monitor, and evaluate participation of students in discussion forums | analyse student interaction and messages in discussion forums | /
17 | meud | U[24] | create diagrams (concept lattices) | analyse choices of study programmes | /
18 | Ora | U[6] | calculate SNA metrics | study SNA metrics to improve student performance classifiers | /
19 | pajek | U[3, 32] | vis. networks and calculate network measures | use student social data to predict drop-out and failure; understand growth of communities on SNSs | /
20 | R | U[8] | use scripts to vis. ELE interaction data | explore ELE interaction data and improve ELEs | WIP
21 | R – igraph package | U[5, 32] | create, refine, vis., and analyse networks | compare student problem-solving approaches in ITSs; understand growth of communities on SNSs | /
22 | RapidMiner | M[7] | create an operator for graph generation | identify similarities between LE course content (used together with No. 12) | /
23 | RSP | C[4] | discover issues in the ITS process | support teachers through AT adaptation | /
24 | SEMILAR toolkit | C[27] | semantic similarity methods for text | assess student natural language input in ITSs | /
25 | SketchMiner | C[29] | generate graphs for student symbolic drawings; compare and cluster drawings | assess student symbolic drawings in ITSs | /
26 | STG | C[4] | interactively vis. student interaction in ITSs | understand student problem solving in ITSs | /
27 | TRADEM | C[23] | perform analysis on content corpus and generate a concept map in ITSs | support development of instructional content in ITSs | /
28 | Visone | U[31] | vis. and analyse SNs, clique analysis | analyse user relationships in WBATs | /

The rows (graph tools) are ordered alphabetically by the tool name (the “Tool” column), which represents the answer to RQ1. In general, we discovered a diverse list of infrequently used graph tools. The usage of the graph tools, which represents the answer to RQ2, is covered by the columns “Usage” and “Features”. In “Usage”, we listed the mode of usage (see Section 2.5) and the references to the papers mentioning the graph tool. In “Features”, we listed tool functionalities and capabilities that were created or employed by the researchers. The most often used feature was to visualize (vis.) graphs. The purpose of the selected studies, which represents the answer to RQ3, is given in the “Purpose” column. Researchers often analysed data from various interrelated systems: intelligent tutoring systems (ITSs) and adaptive tutorials (ATs), learning environments (LEs) including exploratory learning environments (ELEs), learning management systems (LMSs), learning portals (LPs), social network services (SNSs), web-based authoring tools (WBATs), and web-based educational systems (WBESs). Some frequent tasks were analysis of social networks (SNs) and exploratory data analysis (EDA).

The issues that the researchers faced when using the tools, which represent the answer to RQ4, are listed in the “Issues” column. In the majority of the selected papers, the researchers did not discuss problems related to tool usage. The main exceptions are studies in which researchers presented their own tools and discussed missing or incomplete features that should be fully implemented in the future; this was labelled as work in progress (WIP).

4. POTENTIAL LIMITATIONS
The findings might not be representative of the whole EDM community but only of the practitioners who presented their work at one of the EDM conferences. An important issue in the analysis was the lack of information about the used tools. There were various instances when researchers obviously used a graph tool, or at least it could be expected that they relied on such tools, but failed to report the information.

Moreover, we used a somewhat “relaxed” definition of a graph tool. This allowed for the inclusion of both general-purpose tools for graph manipulation and domain-specific tools that were developed for the educational domain but also utilize a graph-based structure. The primary motive behind this choice was to provide a list of graph tools potentially usable in a wider range of studies, as well as a list of tools that illustrates how graphs were implemented or used in a more specific problem. The former tool category generally includes tools associated with the “U” usage (tools utilized without modification), while the latter tool category mostly covers tools associated with the “C” usage (new tools introduced by their authors).

On the other hand, we excluded graph-based tools that could be labelled as data mining tools or causal modelling tools. For instance, some popular predictive and/or explanatory models (decision trees, random forests, and Bayesian networks) are graph-based, while causal modelling usually assumes creation or discovery of causal graphs. As these tools are more often featured in EDM studies, we assumed that EDM researchers are more familiar with their usage, so the focus of the present study is on other less frequently used graph tools.

5. CONCLUSION
We hope that the collected information about the usage of graph tools within the EDM community may prove valuable for researchers considering the use of graphs to solve educational problems. For future work, we plan to include other publication series, even those that are not solely devoted to EDM research. The results of such an attempt could demonstrate whether EDM practitioners from other regions of the world are more represented in graph-based research than indicated by the results of the present study.

6. ACKNOWLEDGMENTS
The research presented in this paper was supported by the Ministry of Education, Science, and Technological Development of the Republic of Serbia under Grant III-44010: “Intelligent Systems for Software Product Development and Business Support based on Models”.

7. REFERENCES
[1] Abbas, S. and Sawamura, H. 2008. Argument mining using highly structured argument repertoire. In Proceedings of the First International Conference on EDM (Montreal, Canada, June 20-21, 2008). EDM’08. 202-209.
[2] Baker, R.S.J.d. and Yacef, K. 2009. The state of educational data mining in 2009: A review and future visions. Journal of Educational Data Mining, 1 (1), 3-17.
[3] Bayer, J., Bydzovska, H., Geryk, J., Obsivac, T. and Popelinsky, L. 2012. Predicting drop-out from social behaviour of students. In Proceedings of the Fifth International Conference on EDM (Chania, Greece, June 19-21, 2012). EDM’12. 103-109.
[4] Ben-Naim, D., Bain, M. and Marcus, N. 2009. A user-driven and data-driven approach for supporting teachers in reflection and adaptation of adaptive tutorials. In Proceedings of the Second International Conference on EDM (Cordoba, Spain, July 1-3, 2009). EDM’09. 21-30.
[5] Eagle, M. and Barnes, T. 2014. Exploring differences in problem solving with data-driven approach maps. In Proceedings of the Seventh International Conference on EDM (London, UK, July 4-7, 2014). EDM’14. 76-83.
[6] Garcia-Saiz, D., Palazuelos, C. and Zorrilla, M. 2014. The predictive power of SNA metrics in education. In Proceedings of the Seventh International Conference on EDM (London, UK, July 4-7, 2014). EDM’14. 419-420.
[7] Goslin, K. and Hofmann, M. 2013. Identifying and visualizing the similarities between course content at a learning object, module and program level. In Proceedings of the Sixth International Conference on EDM (Memphis, TN, USA, July 6-9, 2013). EDM’13. 320-321.
[8] Gutierrez-Santos, S., Mavrikis, M., Poulovassilis, A. and Zhu, Z. 2014. Indicator visualisation for adaptive exploratory learning environments. In Proceedings of the Seventh International Conference on EDM (London, UK, July 4-7, 2014). EDM’14. 377-378.
[9] Johnson, M.W. and Barnes, T. 2010. EDM visualization tool: Watching students learn. In Proceedings of the Third International Conference on EDM (Pittsburgh, PA, USA, June 11-13, 2010). EDM’10. 297-298.
[10] Johnson, M.W., Eagle, M.J., Joseph, L. and Barnes, T. 2011. The EDM Vis tool. In Proceedings of the Fourth International Conference on EDM (Eindhoven, The Netherlands, July 6-8, 2011). EDM’11. 349-350.
[11] Johnson, M.W., Eagle, M. and Barnes, T. 2013. InVis: An interactive visualization tool for exploring interaction networks. In Proceedings of the Sixth International Conference on EDM (Memphis, TN, USA, July 6-9, 2013). EDM’13. 82-89.
[12] Johnson, M.W., Eagle, M., Stamper, J. and Barnes, T. 2013. An algorithm for reducing the complexity of interaction networks. In Proceedings of the Sixth International Conference on EDM (Memphis, TN, USA, July 6-9, 2013). EDM’13. 248-251.
[13] Kitchenham, B. and Charters, S. 2007. Guidelines for Performing Systematic Literature Reviews in Software Engineering. Technical Report EBSE-2007-01, School of Computer Science and Mathematics, Keele University.
[14] Lynch, C.F. 2014. AGG: Augmented graph grammars for complex heterogeneous data. In Extended Proceedings of the Seventh International Conference on EDM (London, UK, July 4-7, 2014). EDM’14. 37-42.
[15] Lynch, C.F. and Ashley, K.D. 2014. Empirically valid rules for ill-defined domains. In Proceedings of the Seventh International Conference on EDM (London, UK, July 4-7, 2014). EDM’14. 237-240.
[16] Martinez-Maldonado, R., Yacef, K. and Kay, J. 2013. Data mining in the classroom: discovering groups’ strategies at a multi-tabletop environment. In Proceedings of the Sixth International Conference on EDM (Memphis, TN, USA, July 6-9, 2013). EDM’13. 121-128.
[17] McTavish, T.S. 2014. Facilitating graph interpretation via interactive hierarchical edges. In Extended Proceedings of the Seventh International Conference on EDM (London, UK, July 4-7, 2014). EDM’14. 59-61.
[18] Merceron, A., Schwarzrock, S., Elkina, M., Pursian, A., Beuster, L., Fortenbacher, A., Kappe, L. and Wenzlaff, B. 2013. Visual exploration of interactions and performance with LeMo. In Proceedings of the Sixth International Conference on EDM (Memphis, TN, USA, July 6-9, 2013). EDM’13. 396-397.
[19] Olson, R., Daily, Z., Malayny, J. and Szkutak, R. 2013. Project CASSI: A social-graph based tool for classroom behavior analysis and optimization. In Proceedings of the Sixth International Conference on EDM (Memphis, TN, USA, July 6-9, 2013). EDM’13. 398-399.
[20] Petersen, K., Feldt, R., Mujtaba, S. and Mattsson, M. 2008. Systematic mapping studies in software engineering. In Proceedings of the Twelfth International Conference on Evaluation and Assessment in Software Engineering (Bari, Italy, June 26-27, 2008).
[21] Portillo-Rodriguez, J., Vizcaino, A., Piattini, M. and Beecham, S. 2012. Tools used in Global Software Engineering: A systematic mapping review. Information and Software Technology. 54 (Mar. 2012), 663-685.
[22] Rabbany Khorasgani, R., Takaffoli, M. and Zaiane, O.R. 2011. Analyzing participation of students in online courses using social network analysis techniques. In Proceedings of the Fourth International Conference on EDM (Eindhoven, The Netherlands, July 6-8, 2011). EDM’11. 21-30.
[23] Ray, F., Brawner, K. and Robson, R. 2014. Using data mining to automate ADDIE. In Proceedings of the Seventh International Conference on EDM (London, UK, July 4-7, 2014). EDM’14. 429-430.
[24] Romashkin, N., Ignatov, D. and Kolotova, E. 2011. How university entrants are choosing their department? Mining of university admission process with FCA taxonomies. In Proceedings of the Fourth International Conference on EDM (Eindhoven, The Netherlands, July 6-8, 2011). EDM’11. 230-233.
[25] Romero, C., Gutierrez, S., Freire, M. and Ventura, S. 2008. Mining and visualizing visited trails in web-based educational systems. In Proceedings of the First International Conference on EDM (Montreal, Canada, June 20-21, 2008). EDM’08. 182-186.
[26] Romero, C. and Ventura, S. 2010. Educational data mining: A review of the state of the art. IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, 40 (6), 601-618.
[27] Rus, V., Banjade, R., Lintean, M., Niraula, N. and Stefanescu, D. 2013. SEMILAR: A semantic similarity toolkit for assessing students’ natural language inputs. In Proceedings of the Sixth International Conference on EDM (Memphis, TN, USA, July 6-9, 2013). EDM’13. 402-403.
[28] Sheshadri, V., Lynch, C. and Barnes, T. 2014. InVis: An EDM tool for graphical rendering and analysis of student interaction data. In Extended Proceedings of the Seventh International Conference on EDM (London, UK, July 4-7, 2014). EDM’14. 65-69.
[29] Smith, A., Wiebe, E., Mott, B. and Lester, J. 2014. SKETCHMINER: Mining learner-generated science drawings with topological abstraction. In Proceedings of the Seventh International Conference on EDM (London, UK, July 4-7, 2014). EDM’14. 288-291.
[30] Vaculik, K., Nezvalova, L. and Popelinsky, L. 2014. Graph mining and outlier detection meet logic proof tutoring. In Extended Proceedings of the Seventh International Conference on EDM (London, UK, July 4-7, 2014). EDM’14. 43-50.
[31] Xu, B. and Recker, M.M. 2010. Peer production of online learning resources: A social network analysis. In Proceedings of the Third International Conference on EDM (Pittsburgh, PA, USA, June 11-13, 2010). EDM’10. 315-316.
[32] Yamakawa, O., Tagawa, T., Inoue, H., Yastake, K. and Sumiya, T. 2011. Combining study of complex network and text mining analysis to understand growth mechanism of communities on SNS. In Proceedings of the Fourth International Conference on EDM (Eindhoven, The Netherlands, July 6-8, 2011). EDM’11. 335-336.
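
As a supplementary illustration of the keyword criterion described in Section 2.3 (KC1 AND KC2, evaluated case-insensitively on whole words, in singular or plural form, with hyphenated forms such as “sub-graph” counting as a match), the matching rules might be sketched in Python as follows. This is only a minimal sketch and not the Java/PDFBox implementation used in the study: the function name matches_kc, the regular expressions, and the simple rule of forming plurals by appending “s” are assumptions introduced here for illustration.

```python
import re

# Keyword groups as listed in Section 2.3; everything else is an assumption.
KC1 = ["graph", "subgraph", "clique"]
KC2 = ["tool", "application", "software", "framework", "suite",
       "package", "toolkit", "environment", "editor"]


def _pattern(keywords):
    """Build a case-insensitive, whole-word pattern for one keyword group."""
    alternatives = []
    for kw in keywords:
        # "software" is searched in singular form only; the other
        # keywords may carry an optional plural "s" (assumed plural rule).
        alternatives.append(kw if kw == "software" else kw + "s?")
    # \b treats "-" as a word boundary, so "sub-graph" matches "graph",
    # while "graphical" or "paragraph" do not match.
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b",
                      re.IGNORECASE)


KC1_RE = _pattern(KC1)
KC2_RE = _pattern(KC2)


def matches_kc(text):
    """Return True when the text satisfies both KC1 and KC2."""
    return bool(KC1_RE.search(text)) and bool(KC2_RE.search(text))
```

Under these assumptions, matches_kc("A sub-graph was drawn with the tool") evaluates to True, whereas a text that mentions only “graphical” output fails KC1. Relying on word boundaries keeps the rule close to the “whole words only” wording of the criterion; irregular plurals are not handled and would need an explicit list.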