=Paper=
{{Paper
|id=Vol-2650/paper27
|storemode=property
|title=Improving E-Learning Material Quality With the Aid of Deep Learning and Workflow Management
|pdfUrl=https://ceur-ws.org/Vol-2650/paper27.pdf
|volume=Vol-2650
|authors=Melinda Pap,László Zsolt Nagy,Dániel Fekete
|dblpUrl=https://dblp.org/rec/conf/icai3/PapNF20
}}
==Improving E-Learning Material Quality With the Aid of Deep Learning and Workflow Management==
Proceedings of the 11th International Conference on Applied Informatics, Eger, Hungary, January 29–31, 2020, published at http://ceur-ws.org

Melinda Pap, László Zsolt Nagy, Dániel Fekete

Eszterházy Károly University, Hungary

(pap.melinda,nagy.laszlo.zsolt,fekete.daniel)@uni-eszterhazy.hu

===Abstract===
E-learning systems have been available for decades; they have become widely accessible and are offered by most colleges and universities. It has been recognized in recent years that the future of e-learning lies in crowd-sourced learning, where multiple users can contribute to the global knowledge base [6, 15]. However, in such systems, the question of quality arises.

In the development of our e-learning system, one aim was to assure the high quality of the created content. To achieve this goal, we have incorporated a workflow engine of the kind commonly used in business intelligence software to manage business processes. We created a system that provides user interfaces for defining state transitions and the corresponding user permissions. This enables versatile validation processes for the different types of materials, so that newly created content can become an approved and verified learning object.

When we talk about e-learning quality, we must not forget that it includes accessibility. When we enable users to create new content in our system, we want to be sure that it is accessible. On the other hand, we do not wish to put the burden of image description entirely on the users. Therefore, we have incorporated a pre-trained “Show and tell” model [24] that generates natural sentences describing an image, to annotate our image database. This not only improves the accessibility of the system, but also makes it possible for our Elasticsearch search engine to provide relevant search results, even when it comes to images.
Deep learning has recently emerged as a universal tool for machine learning tasks. The “Show and tell” model we use is one such tool: a combination of a convolutional neural network (CNN) and a long short-term memory (LSTM) network.

In this paper we present the architecture of our e-learning system, highlighting the concepts aimed at quality assurance.

Keywords: E-learning, workflow, deep learning, crowd-sourced learning, image captioning

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===1. Introduction===
E-learning systems have been around for decades. By now, they have become part of everyday life at colleges and universities. E-learning started as an adaptation of the traditional distance learning model: online courses were organized in a standard way, following a specified curriculum at a predetermined pace. With the advances in web technologies, new trends evolved in the field of e-learning as well. State-of-the-art systems enable users to follow an individual learning process. Moreover, learners can communicate, collaborate and share content with each other through the e-learning software.

In this study, we focus on a Learning Management System (LMS), more precisely a Learning Content Management System (LCMS). An LMS is used for delivering, tracking and managing training and education. An LCMS is more about the creation and sharing of content, which can be hosted in an LMS as learning objects. It has been pointed out in the literature that the future of such content creation and management, and of the development of learning objects, lies in crowd-sourcing and machine learning [18, 15].

Crowd-sourcing makes it possible to utilize the intelligence and wisdom of large groups of people in solving problems and in creating and validating content. Despite all its advantages, it raises new challenges when it comes to quality control.
This is because the crowd can be composed of people with varying skills, intellects and objectives [5].

We have recently entered the era of Big Data, which means that far more data is produced than human operators can process. This is also true of LCMS applications. Machine learning offers a wide range of tools that can help in this matter, especially when it comes to automatically annotating data. Annotation not only aids the searchability of learning objects, but also increases their accessibility, for instance by helping visually impaired people better understand the content of images. Inclusive education is also gaining attention nowadays and has been addressed in the literature [8, 21].

The goal of this paper is to propose an architectural design for an LCMS that incorporates building blocks and methodologies responsible for the quality control and accessibility of learning objects.

===2. Motivation===
In the past years, we developed a national e-learning platform called Ekernel for elementary schools in Hungary. In this software, besides the basic LMS functionalities, it is possible to create content in the form of online books, interactive exercises, uploaded media, etc. Furthermore, users can search for all types of content and share them with each other.

To enable search engines to find relevant information, we must focus our attention on the annotation of our learning objects, even when they are in the form of images, videos or soundtracks. This not only helps the search but, if done well, also improves the accessibility of our educational portal. Owing to recent advances in the field of deep neural networks, the quality of automated image caption generation has increased considerably [2]. In this paper we aim to show how such a tool can be integrated into an e-learning architecture.
As we opened up to crowd-sourced content creation, we soon realized that we needed a sophisticated protocol for publishing the produced data. Our primary target audience is elementary school students, therefore it is of high importance to filter the content according to quality and relevance. We surveyed the possibilities and found that we could take examples from industry, where quality control normally plays a crucial role. This led to the decision to adapt a Workflow Management System (WfMS) [7] to our needs. Such a WfMS enables the description of validation stages, such as professional and linguistic proofreading, plagiarism checks, etc. In the Ekernel system, we do not only have standalone learning objects; we also allow the creation of lecture notes and complete books. In these cases a validation hierarchy is required, in which the checking of subsequent chapters and lessons is the prerequisite of the final publication.

In the following sections, we aim to give an overview of how a WfMS can be integrated into an LCMS architecture in a customizable way.

====2.1. Hypothesis====
In designing the Ekernel system, we aimed to provide an easily understandable, accessible and intuitive user interface. We put high emphasis on the search capabilities of our system: the global search page is available from the main menu and, in addition, we included search fields on all pages of the interactive books.

To increase user satisfaction with usability, all the created pages are responsive, so all content is accessible from mobile devices. This is crucial, since it is highly probable that the students will work on tablets or smartphones during class. Therefore, we expect that students will find the webpage easy to use and easy to understand. We expect that teachers will be more confident in using this tool during course work, and will feel motivated to contribute to the global knowledge base with their own creations.
We also suppose that both teachers and students will find the search functionalities easy to use and will be able to find relevant content in our Ekernel system. We expect that the quantity and quality of learning objects will make users more motivated to use our software.

===3. Architectural design===
Figure 1: Ekernel architectural design

Fig. 1 illustrates the core architecture of our Ekernel system. In designing the system, we put emphasis on using the most recent technologies that arose from the innovations of Big Data and the semantic web. Due to the variety of data types used in an LCMS application, we decided to use, alongside our relational database, a scalable NoSQL database that can support large-scale and high-concurrency applications with heterogeneous data [12, 4]. We use the MongoDB [3] NoSQL database to store documents and images. To follow the trends of the semantic web, we have incorporated Elasticsearch, a distributed, scalable, real-time search and analysis engine [10]. It comes with a built-in database to store the indexes. With its aid, the software is capable of indexing learning objects: not only the text-based ones, but also those that have corresponding metadata (e.g. keywords). Text-based data and the metadata of all objects are stored in a PostgreSQL relational database.

====3.1. Image annotation====
As shown in Fig. 1, we created a service over our MongoDB database for image annotation. This service has two main purposes: one is to provide keywords for an image, and the other is to describe it in a complete, natural-language sentence. Several publications have addressed the problem of image captioning before [27, 9, 2, 14, 25, 26, 1]. Due to the lack of computational power and of a training database with ground truth, we decided to use pre-trained models for our tasks. We chose two pre-trained models, one for keyword generation and one for whole-sentence image captioning.
For the former, we selected the InceptionV3 model [22]. This Convolutional Neural Network (CNN) was trained on the ILSVRC-2012-CLS image classification dataset. For the latter, we chose a “Show and tell” model, also called Neural Image Caption (NIC), provided by the authors of [24]. This method uses a CNN for image representations and a Long Short-Term Memory (LSTM) network for generating the final image captions. The CNN is used as an encoder, and the output of its last hidden layer is used as the input of the LSTM decoder. The Flickr 8K/30K and the MS COCO public image databases were used in the process of training.

We created a service over our MongoDB database that processes the uploaded images and creates the captions for them. This data is stored in the PostgreSQL database and is indexed by the Elasticsearch engine in the background.

Figure 2: Results of the automated image annotations. Row a) shows the keywords returned by InceptionV3. Row b) shows the results of the NIC image captioning.

Fig. 2 shows the results of image annotations. We can see that the whole-sentence captions give a sufficient description of an image. In many cases, we found that the captions are more generic and the keywords give more insight into an image. For instance, the captioning returns “A group of people riding on top of a boat”, while the keyword-generating classification gives the exact type of boat that they are riding: “canoe”.

====3.2. WfMS====
A workflow management system (WfMS) is a software system that helps the setup, execution and monitoring of processes and tasks. The goal of such a system is to increase productivity and agility, improve the information exchange within an organization and assure high quality [7, 23, 19]. Since we do not only operate an LCMS, but also provide LMS functionalities, we did not limit our workflow design to document processing.
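
The idea of a WfMS can be made concrete with a small sketch. The Python example below is an illustration only and not the Ekernel implementation: it models a workflow as a set of permission-guarded transitions between states and records every executed transition. All names in it (Transition, Item, Workflow, move, example_workflow, the state names and the "author" and "proofreader" groups) are hypothetical and chosen for readability.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Transition:
    """A permitted move between two states (hypothetical structure)."""
    source: str
    target: str
    allowed_groups: frozenset  # permission groups that may perform the move


@dataclass
class Item:
    """Any object that passes through a workflow, e.g. a document."""
    state: str
    history: list = field(default_factory=list)  # executed transitions


class Workflow:
    def __init__(self, transitions):
        # Index transitions by (source, target) for quick lookup.
        self._transitions = {(t.source, t.target): t for t in transitions}

    def move(self, item, target, user_groups):
        """Perform a transition if it exists and the user is permitted."""
        transition = self._transitions.get((item.state, target))
        if transition is None:
            raise ValueError(f"no transition {item.state!r} -> {target!r}")
        if not transition.allowed_groups & set(user_groups):
            raise PermissionError(f"not allowed: {item.state!r} -> {target!r}")
        # Record the executed transition so the item's history can be traced.
        item.history.append((item.state, target, datetime.now(timezone.utc)))
        item.state = target


def example_workflow():
    """Assumed example: a three-stage review with one rejection path."""
    return Workflow([
        Transition("under editing", "under proofreading", frozenset({"author"})),
        Transition("under proofreading", "published", frozenset({"proofreader"})),
        Transition("under proofreading", "rejected", frozenset({"proofreader"})),
    ])
```

In this sketch, calling move on an item with a target that is not reachable from its current state raises ValueError, and a call by a user outside the allowed groups raises PermissionError; in both cases the item stays in its current state. A production system would persist states, transitions and history in database tables rather than in memory.
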
The creation and announcement of a course is organized via a workflow, as are the enrollment, evaluation and graduation of a student in our system.

During the design of a workflow, one has to take the following into consideration: 1. execution order of activities, 2. permission-based availability of activities, and 3. permission-based availability of information. In our scenario, we found that we need a routing-type system that routes the flow of information and documents.

Figure 3: The database tables used for our flexible WfMS implementation in the Ekernel architecture

The database architecture of the proposed highly configurable WfMS is presented in Fig. 3. We start with the creation of a meta table (meta.table info) describing all the existing tables in our PostgreSQL database and assigning ids to them. We store the types of possible workflows in the workflow type table. Users with admin permissions can define separate workflows for documents, images and book chapters, and even for the admission of students to an online course in the system. For each workflow type, one can create stations and define transitions connecting them. Such stations can be: under editing, under professional proofreading, rejected, etc. Both stations and transitions can be constrained, such that only users in a specified permission group can access the data when it is in a certain state or can perform a transition to a new state. All of the executed transitions are stored in the database, so one can trace back the history of any item in the workflow queue.

In our system, we enable users to visualize a specific workflow as a directed graph and to configure it via an extensible user interface. Fig. 4 shows parts of the user interfaces. One can define the possible states, then configure the possible transitions via a matrix representation (see Fig. 4a) and set the corresponding privileges. It is possible to create dependent workflows, where the state of one item defines the possible states and transitions of a related item. For instance, if we have a lesson that contains an interactive exercise, we can specify that the lesson can only move to a published state when the contained exercises are published as well. This particular setup is shown in Fig. 4b.

Figure 4: User interfaces for defining the states and transitions in the Ekernel system. This particular workflow is for the interactive exercises. (a) State transition constraint matrix (b) Definition of sub-processes

Figure 5: Graph visualization interface of state transitions

===4. Methodology===
The Ekernel system was first tested from January to June 2019 by 310 Hungarian public schools. During this test, 446 teachers and 3706 fifth-grade students used our Ekernel system for a period of 5 months. At the end of the test run, we invited the participants to fill out a voluntary questionnaire that addressed their satisfaction with our LCMS.

The construction of our questionnaire was based on the theoretical framework of user experience. This research framework distinguishes between the perceived ergonomic quality, perceived hedonic quality and perceived attractiveness of a product [13]. Ergonomic quality focuses on the goal- and task-oriented aspects of product design. High ergonomic quality enables users to reach their goals with efficiency and effectiveness. Hedonic quality focuses on the non-task-oriented quality aspects of a software product, for example the originality of the design or the beauty of the user interface. In the context of the questionnaire, it is necessary to consider both pragmatic and hedonic aspects if we want to measure how satisfied users are with our system. It should allow users to express the feelings, impressions and attitudes that arise when using our product [17, 20].

The questionnaire consisted of 18 questions for both students and teachers for our user experience assessment.
Out of these, 12 questions for each group used a 5-level Likert scale and 5 for each group used a 3-level Likert scale [16], where the scale extended from “not at all” to “very much”. To be precise, since we were working with students aged 10 to 11, we decided to use a smiley-face Likert scale [11] for the 12 questions with five levels (illustrated in Fig. 6). For the additional 5 questions we used textual answers. The questionnaire was filled out online, in our Ekernel system, simultaneously during classwork, supervised by the teachers.

Figure 6: Five-level smiley-face Likert scale used in our survey. The image title was set to the corresponding text (on the scale from “not at all” to “very much”) to be shown on mouse hover.

The questions for teachers can be organized into 3 main categories: 1. satisfaction and confidence (e.g. “How confident did you feel during teaching?”), 2. perceived student satisfaction (e.g. “Did the students understand the material better with the interactive books than with the paper ones?”), 3. platform usability (e.g. “Did you find what you were searching for easily? Could you navigate where you wanted with ease?”).

The students had another set of questions, which we grouped into the following 3 categories: 1. understanding (e.g. “Did you understand the material better with the interactive book than with the paper books?”), 2. motivation and mood (e.g. “Did you find the materials interesting and exciting?”), 3. platform usability (e.g. “Did you find what you were searching for easily?”).

Figure 7: The results of the 5-level Likert scale questionnaire, where level 1 stands for “not at all” and level 5 for “very much”. (a) Teacher satisfaction (b) Student satisfaction

To be able to filter out those who gave random answers, we included opposing questions as well, one in the 5-level part and its pair in the textual answer part.

===5. Empirical findings===
By the time of our final survey, our editorial community of 250 users had written and validated 83 books, containing 4 568 lessons. Furthermore, 13 490 interactive exercises were created and 147 612 images were uploaded to our servers.

Overall, 41 teachers and 258 students participated in our anonymous survey. The answers of 8 participants were filtered out due to inconsistency. The results of the survey are presented in Fig. 7. It shows that 66% of the students claimed that they had a better understanding of the topics with the use of the online, validated materials than with printed books. Furthermore, 74% of the students and 91% of the teachers answered the platform-usability questions positively.

As a hypothesis, we stated that teachers would be motivated to participate in content creation. To the question “How much would you like to contribute to the global knowledge base with your own content?” 59% of them responded positively. In the textual questions part, we asked the participants whether they found this software better than the digital tools they had used so far. In this case, the possible answers were: “no”, “the same” and “yes”. 56% of the teachers and 61% of the students answered “yes”.

===6. Future work===
In the future, we plan to extend our editorial user base. We would like to allow all registered users to submit their creations (books, lesson plans, interactive exercises, etc.) for a reviewing process.

The image captioning also leaves some things to be desired; e.g. higher classification accuracy has been reported in the literature recently. Fig. 8 shows two examples of complex scenes where the annotations have low accuracy. The currently used deep learning models for image annotation can only operate on natural images; however, our image database contains a great number of illustrations as well. Thus, we continue our search for networks that can handle such data.
Furthermore, we plan to use our image database to fine-tune a classification model. We would also like to continue our progress towards accessibility and create machine-learning-supported video subtitling services.

Figure 8: Image annotations with low accuracy.

===7. Acknowledgements===
The research was supported by the grant EFOP-3.6.1-16-2016-00001 (“Complex improvement of research capacities and services at Eszterhazy Karoly University”).

===References===
[1] Asadi, A. I., and Safabakhsh, R. A deep decoder structure based on wordembedding regression for an encoder-decoder based model for image captioning. ArXiv abs/1906.12188 (2019).

[2] Bai, S., and An, S. A survey on automatic image caption generation. Neurocomputing 311 (2018), 291–304.

[3] Banker, K. MongoDB in action. Manning Publications Co., 2011.

[4] Cattell, R. Scalable sql and nosql data stores. SIGMOD Rec. 39, 4 (May 2011), 12–27.

[5] Daniel, F., Kucherbaev, P., Cappiello, C., Benatallah, B., and Allahbakhsh, M. Quality control in crowdsourcing: A survey of quality attributes, assessment techniques, and assurance actions. ACM Computing Surveys (CSUR) 51, 1 (2018), 1–40.

[6] Downes, S. E-learning 2.0. Elearn magazine 2005, 10 (2005), 1.

[7] Du, W., and Elmagarmid, A. Workflow management: State of the art vs. state of the products. HP Laboratories Technical Report HPL (1997).

[8] Ferretti, S., Mirri, S., Muratori, L. A., Roccetti, M., and Salomoni, P. E-learning 2.0: you are we-lcome! In Proceedings of the 2008 international cross-disciplinary conference on Web accessibility (W4A) (2008), ACM, pp. 116–125.

[9] Fu, K., Jin, J., Cui, R., Sha, F., and Zhang, C. Aligning where to see and what to tell: Image captioning with region-based attention and scene-specific contexts. IEEE transactions on pattern analysis and machine intelligence 39, 12 (2016), 2321–2334.

[10] Gormley, C., and Tong, Z. Elasticsearch: the definitive guide: a distributed real-time search and analytics engine. O’Reilly Media, Inc., 2015.

[11] Hall, L., Hume, C., and Tazzyman, S. Five degrees of happiness: Effective smiley face likert scales for evaluating with children. In Proceedings of the 15th International Conference on Interaction Design and Children (2016), pp. 311–321.

[12] Han, J., Haihong, E., Le, G., and Du, J. Survey on nosql database. In 2011 6th international conference on pervasive computing and applications (2011), IEEE, pp. 363–366.

[13] Hassenzahl, M. The effect of perceived hedonic quality on product appealingness. International Journal of Human-Computer Interaction 13, 4 (2001), 481–499.

[14] Hossain, M. Z., Sohel, F., Shiratuddin, M. F., and Laga, H. A comprehensive survey of deep learning for image captioning. ACM Comput. Surv. 51, 6 (Feb. 2019).

[15] Hussain, F. E-learning 3.0 = e-learning 2.0 + web 3.0? International Association for Development of the Information Society (2012).

[16] Jamieson, S., et al. Likert scales: how to (ab)use them. Medical education 38, 12 (2004), 1217–1218.

[17] Laugwitz, B., Held, T., and Schrepp, M. Construction and evaluation of a user experience questionnaire. In Symposium of the Austrian HCI and Usability Engineering Group (2008), Springer, pp. 63–76.

[18] Morris, R. D. Web 3.0: Implications for online learning. TechTrends 55, 1 (2011), 42–46.

[19] Pesic, M. Constraint-based workflow management systems: shifting control to users. PhD thesis, Department of Industrial Engineering and Innovation Sciences, 2008. Proefschrift.

[20] Rauschenberger, M., Schrepp, M., Pérez Cota, M., Olschner, S., and Thomaschewski, J. Efficient measurement of the user experience of interactive products. How to use the user experience questionnaire (UEQ). Example: Spanish language version.

[21] Rogers-Shaw, C., Carr-Chellman, D. J., and Choi, J. Universal design for learning: Guidelines for accessible online instruction. Adult Learning 29, 1 (2018), 20–31.

[22] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. Rethinking the inception architecture for computer vision. CoRR abs/1512.00567 (2015).

[23] Van Der Aalst, W., Van Hee, K. M., and van Hee, K. Workflow management: models, methods, and systems. MIT press, 2004.

[24] Vinyals, O., Toshev, A., Bengio, S., and Erhan, D. Show and tell: Lessons learned from the 2015 mscoco image captioning challenge. IEEE transactions on pattern analysis and machine intelligence 39, 4 (2016), 652–663.

[25] Wu, Q., Shen, C., Wang, P., Dick, A., and van den Hengel, A. Image captioning and visual question answering based on attributes and external knowledge. IEEE transactions on pattern analysis and machine intelligence 40, 6 (2017), 1367–1381.

[26] Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhudinov, R., Zemel, R., and Bengio, Y. Show, attend and tell: Neural image caption generation with visual attention. In International conference on machine learning (2015), pp. 2048–2057.

[27] You, Q., Jin, H., Wang, Z., Fang, C., and Luo, J. Image captioning with semantic attention. In Proceedings of the IEEE conference on computer vision and pattern recognition (2016), pp. 4651–4659.